Abstract

An automaton is universal if it accepts every possible input. We study
the notion of u-universality, which asserts that the automaton accepts
every input starting with u. Universality and u-universality are both
EXPTIME-hard for non-deterministic tree automata. We propose efficient
antichain-based techniques to address these problems for visibly
pushdown automata operating on trees. One of our approaches yields
algorithms for the universality and u-universality of hedge automata.

1 Introduction 

The model-checking framework has provided many successful tools for decades,
starting from the seminal work of Büchi. Many of them rely on the links between
logics used to express properties on words, and automata that allow checking them.
Some of these results have been adapted to trees, and more recently to words
with a nesting structure.

Visibly pushdown automata (VPAs) have been introduced to process such
words with nesting [AM04]. VPAs are similar to pushdown automata, but
operate on a partitioned alphabet: a given letter is associated with one action
(push or pop), and thus cannot push when firing one transition and pop when
firing another. Such automata were introduced to express and check properties
on control flows of programs, where procedure calls push on the stack, and
returns pop. They are also suitable to express properties on XML documents
[KMV07]. These are usually represented as trees, but are serialized as a sequence
of opening and closing tags, also called the linearization of the document, or
its corresponding XML stream.

Processing such streams without building the corresponding tree is permitted
by online algorithms. It is often crucial to detect at the earliest position of the
stream whether it satisfies a given property or not. When the property is given
by an automaton, we call this automaton u-universal when the stream begins
with the word u, and u ensures that the whole stream is accepted by the automaton,
whatever it contains after u. Indeed, this is a variant of universality of automata:
universality is ε-universality, and amounts to asserting that the property will be
true for every possible stream, and thus can be asserted before reading the first
letter. While universality of automata is a very strong property, u-universality
arises each time an automaton checks the presence of a pattern in trees, and
this pattern appears in u.

A delay in the detection of a violation may be exploited to break firewalling
systems when they use XML for logs [BJLW08], or to perform a denial of service
attack on a remote program. In a less critical sense, early detection can also be
used in XML validators, to assert validation or non-validation of a document before
reading it entirely. For program traces, this is usually addressed by online
verification algorithms operating on words but without considering the nesting
relation between program calls and returns [KV01]. In the XML setting, some
streaming algorithms have been proposed. Most of them are not earliest, and
require a delay between the position where acceptance/refusal can be decided,
and the position where it is claimed.

Indeed, testing u-universality is computationally hard on linearizations of
trees. When the property is specified by a deterministic automaton, this can
be checked in cubic time. On non-deterministic automata, u-universality
becomes EXPTIME-complete [GNT09]. Non-determinism naturally arises when
automata are obtained from logic formulas, as for instance XPath expressions
with descendant axis [FDL11, GN11].

In this paper we propose new algorithms for deciding universality and
u-universality of non-deterministic tree automata on unranked trees accessed through
their linearization. Our goal is to obtain algorithms that outperform the usual
approach consisting in determinizing the automaton. We want our algorithms
for u-universality to be incremental, in that, for a letter a, deciding
ua-universality should reuse as much information from the u-universality computation
as possible. Indeed we want to find the earliest position allowing to assert
acceptance, so we have to test u-universality for every prefix u before that point.

We use antichains to get smaller objects to manipulate, and develop other
ad-hoc methods. Antichains have been applied recently to decision problems
related to non-deterministic automata: universality and inclusion for finite word
automata [DWDHR06], and for non-deterministic bottom-up tree automata
[BHH+08]. Some simulation relations are also known on unranked trees [Srb06]
but it is unclear whether they can help for our problems, as they do in other
contexts [ACH+10, DR10]. Nguyen [Ngu09] proposed an algorithm for testing
the universality of VPAs. This algorithm simultaneously performs an on-the-fly
determinization and reachability checking by P-automaton. The notion of
P-automaton introduced in [EHRS00, EKS03] provides a symbolic technique to
compute the sets of all reachable configurations of a VPA. This algorithm has
been later improved by Nguyen and Ohsaki [NO12] by introducing antichains
over transitions of P-automata, in a way to generate reachable configurations
as small as possible. Our algorithms for universality are alternatives to this one
since we do not use the regularity property of the set of reachable configurations.
Moreover, our techniques for incrementally testing u-universality are totally new
with respect to this algorithm. A problem similar to u-universality is addressed in
[MV09] in the context of query answering. Their algorithm applies to
non-deterministic VPAs recognizing a canonical language of a query, but the
automata are assumed to only accept prefixes u for which u-universality holds,
which is precisely the goal of our algorithms.

We contribute two algorithms for checking u-universality of VPAs on
linearizations of unranked trees. The first algorithm is by reduction to u-universality
(and also universality) of hedge automata. Hedge automata are the standard
automaton model used for unranked trees [BKMW01], and run in a bottom-up
manner. Hedge automata are similar to XML schema models like DTDs
or Relax NG. The second algorithm is a direct algorithm on VPAs. Such an
algorithm was known in the deterministic case [GNT09], and relied on the
incremental computation of safe states. This algorithm cannot be generalized to
the non-deterministic case, as sets of safe states do not contain enough
information. Instead, we use sets of safe configurations, which may be infinite, but
are manipulated through finite antichains. We show how SAT solvers can be used
to update these antichains.

The paper is structured as follows. In Section 2 we define trees, visibly pushdown
automata and the problem of u-universality. Section 3 details our first
algorithm, relying on a translation to hedge automata. Section 4 contains our
second algorithm, namely the incremental computation of sets of safe
configurations.



2 Trees, Automata and u-universality
2.1 Unranked Trees 

We recall here the standard definition of unranked trees, as provided for instance
in [CDG+07]. Let $\Sigma$ be a finite alphabet, and $\Sigma^*$ (resp. $\Sigma^+$) be the set of all
words (resp. non-empty words) over $\Sigma$. The empty word is denoted by $\varepsilon$. Given
two words $v, w \in \Sigma^*$, $v$ is a prefix (resp. proper prefix) of $w$ if there
exists a word $v' \in \Sigma^*$ (resp. $v' \in \Sigma^+$) such that $vv' = w$. Let $\mathbb{N}_0$ be the set of
all non-negative integers.

An unranked tree $t$ over $\Sigma$ is a partial function $t : \mathbb{N}_0^* \to \Sigma$ such that the
domain is non-empty, finite and prefix-closed. The domain is denoted by nodes($t$)
and contains the nodes of the tree $t$, with the root being the empty word $\varepsilon$. The
function $t$ labels each node $p$ with a letter $t(p)$ of $\Sigma$. A node labeled by $a \in \Sigma$
is called an $a$-node. The set of all unranked trees over $\Sigma$ is denoted by $T_\Sigma$.

The subtree of $t$ rooted at node $p$ of $t$ is the tree denoted by $t|_p$, whose domain
is the set of nodes $p'$ such that $pp' \in$ nodes($t$), and which verifies $t|_p(p') = t(pp')$. For
a given node $p \in$ nodes($t$), we call children of $p$ the nodes $pi \in$ nodes($t$) for
$i \in \mathbb{N}_0$, and use the usual definitions for parents, ancestors and descendants.
The height of a tree is the length of its longest branch (with the length being
the number of nodes).

Example 1. Let $t_1 : \{\varepsilon, 1, 2, 3, 4, 5, 51, 52\} \to \{a, b, c\}$ be such that $t_1(\varepsilon) = c$,
$t_1(1) = a$, $t_1(2) = a$, $t_1(3) = a$, $t_1(4) = a$, $t_1(5) = b$, $t_1(51) = b$, $t_1(52) = b$. Tree
$t_1$ is an unranked tree with height 3. It can be represented as in Figure 1.

Another example is $t_2 : \{\varepsilon, 1, 11, 12, 121, 122, 2, 3, 31, 32, 33, 34\} \to \{a, b, c\}$,
as illustrated in Figure 2.

Figure 1: Representation of unranked tree $t_1$.
Figure 2: Representation of $t_2$.
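The definition of unranked trees can be made concrete with a small sketch. The encoding below is our own assumption, not part of the paper: a node is a Python tuple of child indices (the root is the empty tuple), and a tree is a dictionary from nodes to labels. The helper names `is_tree_domain`, `height` and `subtree` are hypothetical.

```python
def is_tree_domain(nodes):
    """Check that a node set is non-empty and prefix-closed.

    Checking the parent of every non-root node is enough: by induction,
    every prefix of a node is then also in the set.
    """
    nodes = set(nodes)
    return bool(nodes) and all(p[:-1] in nodes for p in nodes if p)


def height(tree):
    """Number of nodes on the longest branch of the tree."""
    return 1 + max(len(p) for p in tree)


def subtree(tree, p):
    """The tree t|_p: nodes p' such that p.p' is a node of t."""
    return {q[len(p):]: a for q, a in tree.items() if q[:len(p)] == p}


# Tree t1 of Example 1.
t1 = {(): "c", (1,): "a", (2,): "a", (3,): "a", (4,): "a",
      (5,): "b", (5, 1): "b", (5, 2): "b"}
```

With this encoding, `height(t1)` evaluates to 3, in line with Example 1, and `subtree(t1, (5,))` is the tree made of a b-root with two b-leaves.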



Linearization. Trees can be described by well-balanced words which correspond
to a depth-first traversal of the tree. An opening tag marks the arrival at a node
and a closing tag marks the departure from it. For each $a \in \Sigma$, let $a$ itself
represent the opening tag and $\bar{a}$ the related closing tag. The linearization $[t]$
of $t \in T_\Sigma$ is the well-balanced word over $\Sigma \cup \overline{\Sigma}$, with
$\overline{\Sigma} = \{\bar{a} \mid a \in \Sigma\}$, inductively defined by:

$$[t] = a\, [t|_1] \cdots [t|_n]\, \bar{a}$$

where $a = t(\varepsilon)$ and the root has $n$ children. We denote by $[T_\Sigma]$ the set of
linearizations of all trees in $T_\Sigma$. Let PPref($T_\Sigma$) denote the set of all proper prefixes
of $[T_\Sigma]$: PPref($T_\Sigma$) $= \{u \in (\Sigma \cup \overline{\Sigma})^* \mid \exists v \in (\Sigma \cup \overline{\Sigma})^+,\ uv \in [T_\Sigma]\}$.

Example 2. Let $t_1$ and $t_2$ be the trees defined in Example 1. Then

$$[t_1] = c\; a\bar{a}\; a\bar{a}\; a\bar{a}\; a\bar{a}\; b\; b\bar{b}\; b\bar{b}\; \bar{b}\; \bar{c}$$
$$[t_2] = a\; a\; b\bar{b}\; c\; b\bar{b}\; c\bar{c}\; \bar{c}\; \bar{a}\; b\bar{b}\; a\; a\bar{a}\; b\bar{b}\; c\bar{c}\; a\bar{a}\; \bar{a}\; \bar{a}$$

2.2 Visibly pushdown automata

Visibly pushdown automata (VPAs, [AM04, AM09]) are pushdown automata
operating on a partitioned alphabet where only call symbols can push, return
symbols can pop, and internal symbols can do transitions without considering
the stack.

In this paper we only consider languages of unranked trees, so we use VPAs as
unranked tree acceptors, operating on their linearization (also named streaming
tree automata [GNR08]). This corresponds to the following restrictions. First,
the alphabet is only partitioned into call symbols $\Sigma$ and return symbols $\overline{\Sigma}$, and
does not contain internal symbols. Second, all linearizations recognized by these
VPAs are such that every pair of matched call $a$ and return $\bar{b}$ satisfies $a = b$,
corresponding to the label of the corresponding node of the tree. Third, all
linearizations are well-matched and single-rooted, so the acceptance condition
is that a final state is reached on empty stack.

Definition 3. A visibly pushdown automaton $A$ over alphabet $\Sigma$ is a tuple
$A = (Q, \Sigma, \Gamma, Q_i, Q_f, \Delta)$ where $Q$ is a finite set of states containing initial states
$Q_i \subseteq Q$ and final states $Q_f \subseteq Q$, $\Gamma$ is a finite set of stack symbols, and $\Delta$ is a finite
set of rules. Each rule in $\Delta$ is of the form $q \xrightarrow{a:\gamma} q'$ with $a \in \Sigma \cup \overline{\Sigma}$, $q, q' \in Q$,
and $\gamma \in \Gamma$.

The left-hand side of a rule $q \xrightarrow{a:\gamma} q' \in \Delta$ is $(q, a)$ if $a \in \Sigma$, and $(q, a, \gamma)$ if
$a \in \overline{\Sigma}$. A VPA is deterministic if it has at most one initial state, and it does
not have two distinct rules with the same left-hand side.

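A configuration of a VPA $A$ is a pair $(q, \sigma)$ where $q \in Q$ is a state and $\sigma \in \Gamma^*$
a stack content. A configuration is initial (resp. final) if $q \in Q_i$ (resp. $q \in Q_f$)
and $\sigma = \varepsilon$. For $a \in \Sigma \cup \overline{\Sigma}$, we write $(q, \sigma) \xrightarrow{a} (q', \sigma')$ if there is a transition
$q \xrightarrow{a:\gamma} q'$ in $\Delta$ verifying $\sigma' = \gamma \cdot \sigma$ if $a \in \Sigma$, and $\sigma = \gamma \cdot \sigma'$ if $a \in \overline{\Sigma}$. We extend
this notation to words, by writing $(q_0, \sigma_0) \xrightarrow{a_1 \cdots a_n} (q_n, \sigma_n)$ whenever there exist
configurations $(q_i, \sigma_i)$ such that $(q_{i-1}, \sigma_{i-1}) \xrightarrow{a_i} (q_i, \sigma_i)$ for all $1 \le i \le n$. From
$u \in (\Sigma \cup \overline{\Sigma})^*$ and a set of configurations $\mathcal{C} \subseteq Q \times \Gamma^*$, we also define $\mathrm{Post}_u(\mathcal{C})$ as
the set of configurations $(q', \sigma')$ for which there exists a configuration $(q, \sigma) \in \mathcal{C}$
such that $(q, \sigma) \xrightarrow{u} (q', \sigma')$.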

A run of a VPA $A$ on a linearization $[t] = a_1 \cdots a_n$ of $t \in T_\Sigma$ is a sequence
$(q_0, \sigma_0) \cdots (q_n, \sigma_n)$ of configurations $(q_i, \sigma_i)$ such that $(q_0, \sigma_0)$ is initial, and for
every $1 \le i \le n$, $(q_{i-1}, \sigma_{i-1}) \xrightarrow{a_i} (q_i, \sigma_i)$. Such a run is accepting if $(q_n, \sigma_n)$
is final. A tree $t \in T_\Sigma$ is accepted by $A$ if there is an accepting run on its
linearization $[t]$. The set of accepted trees is called the language of $A$ and is
written $L(A)$.
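The definitions of linearization, configurations and runs can be sketched in a few lines of Python. This is only an illustration under our own assumptions: trees are dictionaries from tuples of child indices to labels, a letter of the linearization is a pair `(label, is_call)`, a VPA rule is a tuple `(q, label, is_call, gamma, q2)`, and the toy automaton `RULES` (accepting the trees whose root is an a-node) is hypothetical.

```python
def linearize(tree, p=()):
    """[t] as a list of (label, is_call) pairs, by depth-first traversal."""
    kids = sorted(q for q in tree if len(q) == len(p) + 1 and q[:len(p)] == p)
    word = [(tree[p], True)]                 # opening tag
    for k in kids:
        word += linearize(tree, k)
    return word + [(tree[p], False)]         # closing tag


def post(rules, configs, letter, is_call):
    """Successor configurations after reading one letter."""
    out = set()
    for state, stack in configs:
        for (q, a, call, gamma, q2) in rules:
            if (q, a, call) != (state, letter, is_call):
                continue
            if call:
                out.add((q2, (gamma,) + stack))      # push gamma
            elif stack and stack[0] == gamma:
                out.add((q2, stack[1:]))             # pop gamma
    return out


def accepts(rules, initial, final, tree):
    """True iff some run on [t] ends in a final state with empty stack."""
    configs = {(q, ()) for q in initial}
    for letter, is_call in linearize(tree):
        configs = post(rules, configs, letter, is_call)
    return any(q in final and not stack for q, stack in configs)


# Hypothetical toy VPA: accepts exactly the trees whose root is labeled a.
RULES = [("q0", "a", True, "g0", "q1"), ("q1", "a", False, "g0", "qf")]
RULES += [("q1", x, c, "g1", "q1") for x in "ab" for c in (True, False)]

t1 = {(): "c", (1,): "a", (2,): "a", (3,): "a", (4,): "a",
      (5,): "b", (5, 1): "b", (5, 2): "b"}
```

Tracking the whole set of reachable configurations, rather than a single run, is what makes the simulation handle non-determinism; it is a direct reading of the $\mathrm{Post}_u$ operator, not an optimized procedure.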

2.3 Universality and u-universality

We conclude the preliminaries with the notions of universality and u-universality, 
that we will study in the remainder of the paper. 

Definition 4. A tree automaton $A$ over $\Sigma$ is said to be universal if $A$ accepts all
trees $t \in T_\Sigma$. Let $u$ be a prefix of $[t_0]$ for some tree $t_0 \in T_\Sigma$. The tree
automaton $A$ is said to be u-universal if for all trees $t \in T_\Sigma$, if $u$ is a prefix of $[t]$,
then $t$ is accepted by $A$.

In other words, u-universality allows us to assert that any tree linearization
beginning with u is accepted by the automaton. The two previous definitions
do not depend on the tree automaton $A$ but only on the language $L(A)$.
Therefore they are independent of the kind of tree automata that are used, as
long as they are equivalent.

Our objective is to propose incremental algorithms for u-universality, in
the following sense. The linearization $[t_0]$ of a given tree $t_0$ is read letter by
letter, and while $A$ is not u-universal for the currently read prefix u of $[t_0]$, the
next letter of $[t_0]$ is read. For instance Algorithm 1 shows how u-universality
is checked incrementally. When processing a new letter, we try to reuse prior
computations as much as possible. The automaton can be supposed to be not
universal, otherwise it is u-universal for all such words u.



Algorithm 1 Checking u-universality incrementally
function Incremental-u-universality($A$, $w$)
    $i \leftarrow 1$
    while $i \le |w|$ do
        if $A$ is $w_1 \cdots w_i$-universal then
            return True
        end if
        $i \leftarrow i + 1$
    end while
    return False
end function
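The loop of Algorithm 1 can be sketched as follows. The function name `earliest_universal_prefix` and the oracle parameter `is_u_universal` are our own hypothetical names; the oracle stands for whichever u-universality test is used. Unlike Algorithm 1, the sketch returns the position found rather than a Boolean, and it does not reuse any computation between prefixes.

```python
def earliest_universal_prefix(word, is_u_universal):
    """Length of the shortest non-empty prefix u of `word` such that
    is_u_universal(u) holds, or None if there is no such prefix.

    Algorithm 1 returns True in the first case and False in the second.
    """
    for i in range(1, len(word) + 1):
        if is_u_universal(word[:i]):
            return i
    return None


# Toy property (an assumption for illustration): "the tree contains a b-node".
# With opening tags in lowercase and closing tags in uppercase, a prefix u
# guarantees this property exactly when an opening b already occurs in u.
def contains_b(u):
    return "b" in u
```

On the linearization of $t_1$ written as `"caAaAaAaAbbBbBBC"`, `earliest_universal_prefix` with `contains_b` returns 10, the position of the first opening b-tag.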



It has been shown in [GNT09] that u-universality is EXPTIME-complete for
VPAs, but in PTIME for deterministic VPAs. Determinization is in exponential
time for VPAs, and our algorithms aim at avoiding this exponential blowup.

An incremental u-universality check as described in Algorithm 1 is very
useful. First, given a tree $t_0$, it allows a streaming membership test of $t_0$ in $L(A)$:
its linearization $[t_0]$ is read letter by letter, and the algorithm declares as soon
as possible whether $t_0$ is accepted by $A$. Second, when a property (of XML
documents for instance) is given by a tree automaton, then Algorithm 1 detects
at the earliest position of $[t_0]$ whether $t_0$ satisfies the property.



3 Hedge automata approach 

We present algorithms for testing universality and u-universality of a
non-deterministic visibly pushdown automaton. The approach followed in this section
is based on a translation of the VPA into a hedge automaton. Algorithms
with several optimizations are then provided for checking universality and
u-universality of hedge automata.



3.1 Hedge automata 

We present the standard notion of hedge automata [BKMW01, CDG+07], the
usual automaton model for expressing properties on XML documents. Indeed, a
hedge automaton resembles a DTD: a DTD is a set of rules like $a \to b^+ c$ saying
that the children of an $a$-node must be a non-empty sequence of $b$-nodes followed
by a $c$-node. Hedge automata are a bit more expressive than DTDs, in that
regular languages operate on states instead of labels, enabling for instance to
distinguish two kinds of $a$-nodes.

A hedge $h$ over a finite alphabet $\Sigma$ is a sequence (empty or not) of unranked
trees over $\Sigma$. The set of all hedges over $\Sigma$ is denoted by $H_\Sigma$. For instance, given
the trees $t_1$ and $t_2$ from Example 1, the sequence $t_1 t_2 t_1$ is a hedge.

Definition 5. A hedge automaton over $\Sigma$ is a tuple $A = (Q, \Sigma, Q_f, \Delta)$ where
$Q$ is a finite set of states, $Q_f \subseteq Q$ is the set of final states, and $\Delta$ is a finite set
of transition rules of the following type:

$$(a, L, q)$$

where $a \in \Sigma$, $q \in Q$, and $L \subseteq Q^*$ is a regular language over $Q$, called a horizontal
language.

We denote by $\mathcal{H}_A$ the set of all horizontal languages of $A$. Note that for
every $a \in \Sigma$ and $q \in Q$, we can assume that there is only one $L$ such that
$(a, L, q) \in \Delta$. Indeed, we can replace all rules $(a, L', q)$ by one rule $(a, L, q)$
where $L$ is the union of all such $L'$. A hedge automaton is deterministic if for
all pairs of rules $(a, L_1, q_1)$ and $(a, L_2, q_2)$ we have $L_1 \cap L_2 = \emptyset$ or $q_1 = q_2$.

A run of $A$ on a tree $t \in T_\Sigma$ is a tree $r \in T_Q$ with the same domain as $t$ such
that for each node $p \in$ nodes($r$) and its $n$ children $p1, p2, \ldots, pn$, if $a = t(p)$
and $q = r(p)$, then there is a rule $(a, L, q) \in \Delta$ with $r(p1) r(p2) \cdots r(pn) \in L$. In
particular, to apply the rule $(a, L, q)$ at a leaf, the empty word $\varepsilon$ has to belong
to $L$. Intuitively, a hedge automaton $A$ operates in a bottom-up manner on a
tree $t$: with a run $r$, it assigns a state to each leaf, and then to each internal
node, according to the states assigned to its children. We use the notation $t \hookrightarrow_A q$
to indicate the existence of a run $r$ on $t$ that labels the root of $t$ by the state
$q$. Such a run $r$ is accepting if $q$ is final, i.e. $r(\varepsilon) \in Q_f$. An unranked tree $t$ is
accepted by $A$ if there exists an accepting run on it. The language $L(A)$ of $A$
is the set of all unranked trees accepted by $A$.

Example 6. Let $A = (Q, \Sigma, Q_f, \Delta)$ be a hedge automaton over $\Sigma = \{a, b, c\}$
with $Q = \{q_a, q_b, q_c, q_f\}$, $Q_f = \{q_f\}$, and $\Delta = \{(a, L_1, q_a), (b, L_1, q_b), (c, L_1, q_c),
(a, L_2, q_f), (a, L_3, q_f), (b, L_3, q_f), (c, L_3, q_f)\}$ where $L_1 = Q^*$, $L_2 = q_b q_c$ and
$L_3 = Q^* q_f Q^*$.

Let $t_1$ and $t_2$ be the trees from Example 1. Figure 3 represents a run $r_1$ of $A$
on $t_1$ and two runs, $r_2$ and $r_3$, of $A$ on $t_2$. The runs $r_1$ and $r_2$ are not accepting,
whereas $r_3$ is accepting. The tree $t_1$ is not accepted by $A$, whereas $t_2$ is accepted
by $A$. The language of $A$ is the set of all trees having a subtree $s$ whose root is
an $a$-node and has two children with $s(1) = b$ and $s(2) = c$.
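The claims of Example 6 can be checked with a small bottom-up evaluation. The sketch below rests on our own assumptions: trees are dictionaries from tuples of child indices to labels, horizontal languages are given as Python predicates on tuples of states (rather than as regular expressions or automata), and the labels of `t2` are those read off the linearization $[t_2]$ of Example 2. The brute-force enumeration of children state words is exponential and only meant for tiny examples.

```python
from itertools import product


def in_l1(w):            # L1 = Q*
    return True


def in_l2(w):            # L2 = {q_b q_c}
    return w == ("qb", "qc")


def in_l3(w):            # L3 = Q* q_f Q*
    return "qf" in w


RULES = [("a", in_l1, "qa"), ("b", in_l1, "qb"), ("c", in_l1, "qc"),
         ("a", in_l2, "qf"), ("a", in_l3, "qf"),
         ("b", in_l3, "qf"), ("c", in_l3, "qf")]
FINAL = {"qf"}


def root_states(tree, p=()):
    """All states q such that some run labels node p with q."""
    kids = sorted(k for k in tree if len(k) == len(p) + 1 and k[:len(p)] == p)
    kid_states = [root_states(tree, k) for k in kids]
    # product() over no children yields the empty word, as required at leaves.
    return {q for (a, lang, q) in RULES if a == tree[p]
            for w in product(*kid_states) if lang(w)}


def accepted(tree):
    return bool(root_states(tree) & FINAL)


t1 = {(): "c", (1,): "a", (2,): "a", (3,): "a", (4,): "a",
      (5,): "b", (5, 1): "b", (5, 2): "b"}
t2 = {(): "a", (1,): "a", (1, 1): "b", (1, 2): "c", (1, 2, 1): "b",
      (1, 2, 2): "c", (2,): "b", (3,): "a", (3, 1): "a", (3, 2): "b",
      (3, 3): "c", (3, 4): "a"}
```

Here `root_states(t1)` is `{"qc"}` and `root_states(t2)` is `{"qa", "qf"}`, which agrees with the statement that $t_1$ is rejected and $t_2$ is accepted.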

3.2 From VPAs to hedge automata

In this section, we describe a translation of VPAs into hedge automata, with 
the aim to transfer universality and u-universality testing of a VPA to a hedge 
automaton. 

Theorem 7. Let $A$ be a VPA. Then one can construct a hedge automaton $A_H$
such that for all $t \in T_\Sigma$, $[t] \in L(A)$ iff $t \in L(A_H)$.

Proof. Let $A = (Q, \Sigma, \Gamma, Q_i, Q_f, \Delta)$ be a VPA. We define the hedge automaton
$A_H = (Q', \Sigma, Q'_f, \Delta')$ such that

• $Q' = Q \times Q$

• $Q'_f = Q_i \times Q_f$

• $\Delta' = \{(a, L_{s,s'}, (q, q')) \mid \exists \gamma \in \Gamma,\ q \xrightarrow{a:\gamma} s \in \Delta \text{ and } s' \xrightarrow{\bar{a}:\gamma} q' \in \Delta\}$ where
$L_{s,s'} = \{(s, q_1) \cdot (q_1, q_2) \cdots (q_{n-1}, q_n) \cdot (q_n, s') \mid n \ge 0,\ s, q_1, \ldots, q_n, s' \in Q\} \cup K_{s,s'}$,
and $K_{s,s'} = \emptyset$ if $s \ne s'$, $K_{s,s'} = \{\varepsilon\}$ otherwise.

Notice that each language $L_{s,s'}$ is regular. Let us prove for all $t \in T_\Sigma$ and
$q, q' \in Q$ that:

$$(q, \varepsilon) \xrightarrow{[t]} (q', \varepsilon) \iff t \hookrightarrow_{A_H} (q, q')$$

As a consequence, we will have $[t] \in L(A)$ iff $t \in L(A_H)$.

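We proceed by induction on the height of $t$. We begin with the basic case
height($t$) $= 1$, i.e. $t$ is an $a$-leaf for some $a \in \Sigma$. Then $t \hookrightarrow_{A_H} (q, q')$ iff $\exists s \in Q$, $\gamma \in \Gamma$
such that $q \xrightarrow{a:\gamma} s \in \Delta$ and $s \xrightarrow{\bar{a}:\gamma} q' \in \Delta$ (recall that $\varepsilon \in L_{s,s}$). This is equivalent
to $(q, \varepsilon) \xrightarrow{[t]} (q', \varepsilon)$.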

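Let $i > 1$ and suppose that the property holds for all trees of height less
than $i$. Let $t$ be a tree of height $i$ such that $a = t(\varepsilon)$ and the root has $n$ children.

Let $r$ be a run of $A_H$ on $t$ such that $(q, q') = r(\varepsilon)$. Then, by definition of
$A_H$, there exist $q_1, \ldots, q_{n+1} \in Q$ and $\gamma \in \Gamma$ such that $r(1) = (q_1, q_2)$, $r(2) =
(q_2, q_3)$, $\ldots$, $r(n) = (q_n, q_{n+1})$, $q \xrightarrow{a:\gamma} q_1 \in \Delta$ and $q_{n+1} \xrightarrow{\bar{a}:\gamma} q' \in \Delta$. We know by
induction hypothesis that $(q_i, \varepsilon) \xrightarrow{[t|_i]} (q_{i+1}, \varepsilon)$ for all $1 \le i \le n$. It follows that
$(q_1, \varepsilon) \xrightarrow{[h]} (q_{n+1}, \varepsilon)$ where $h = t|_1 t|_2 \cdots t|_n$ and $[h] = [t|_1] \cdots [t|_n]$. We have also
$(q_1, \gamma) \xrightarrow{[h]} (q_{n+1}, \gamma)$ since $h$ is a hedge, and thus $(q, \varepsilon) \xrightarrow{[t]} (q', \varepsilon)$.

Suppose now that $(q, \varepsilon) \xrightarrow{[t]} (q', \varepsilon)$. So there exist $q_1, \ldots, q_{n+1} \in Q$ and $\gamma \in \Gamma$
such that $q \xrightarrow{a:\gamma} q_1 \in \Delta$, $q_{n+1} \xrightarrow{\bar{a}:\gamma} q' \in \Delta$, and $(q_i, \gamma) \xrightarrow{[t|_i]} (q_{i+1}, \gamma)$ for all $i$. By
induction hypothesis, $t|_i \hookrightarrow_{A_H} (q_i, q_{i+1})$ for all $i$, and thus $t \hookrightarrow_{A_H} (q, q')$. □

As a consequence of Theorem 7, universality and u-universality testing of a
VPA $A$ is transferred to the hedge automaton $A_H$.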
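The construction in the proof of Theorem 7 can be evaluated directly on a tree without building the horizontal languages explicitly. The sketch below is our own reading of the construction, under stated assumptions: VPA rules are tuples `(q, label, is_call, gamma, q2)`, trees are dictionaries from tuples of child indices to labels, and the toy rule set `RULES` (a VPA accepting the trees whose root is an a-node) is hypothetical. A word of pairs belongs to $L_{s,s'}$ exactly when the pairs chain from $s$ to $s'$, so membership reduces to composing the children's sets of pairs as relations.

```python
def compose(r1, r2):
    """Relational composition: first r1, then r2."""
    return {(s, u) for (s, t) in r1 for (t2, u) in r2 if t == t2}


def hedge_states(rules, states, tree, p=()):
    """Pairs (q, q') that A_H can assign to node p."""
    kids = sorted(k for k in tree if len(k) == len(p) + 1 and k[:len(p)] == p)
    # chain = pairs (s, s') such that the children can be labeled by a word
    # of L_{s,s'}; with no children this is the identity (K_{s,s} = {eps}).
    chain = {(s, s) for s in states}
    for k in kids:
        chain = compose(chain, hedge_states(rules, states, tree, k))
    a = tree[p]
    return {(q, q2)
            for (q, x, call, g, s) in rules if call and x == a
            for (s2, y, call2, g2, q2) in rules
            if not call2 and y == a and g2 == g and (s, s2) in chain}


# Hypothetical toy VPA: accepts exactly the trees whose root is labeled a.
STATES = {"q0", "q1", "qf"}
RULES = [("q0", "a", True, "g0", "q1"), ("q1", "a", False, "g0", "qf")]
RULES += [("q1", x, c, "g1", "q1") for x in "ab" for c in (True, False)]


def accepted(tree, initial=frozenset({"q0"}), final=frozenset({"qf"})):
    """Acceptance by A_H: some root pair lies in Q_i x Q_f."""
    return any(q in initial and q2 in final
               for (q, q2) in hedge_states(RULES, STATES, tree))
```

For the tree with an a-root and one b-leaf, `hedge_states` returns the pairs `("q0", "qf")` and `("q1", "q1")`, so the tree is accepted; swapping the two labels leaves only `("q1", "q1")`, and the tree is rejected.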

3.3 Checking universality 

A standard method to check universality of a hedge automaton is to determinize
it, complement it, and check for emptiness. As determinization is in exponential
time, we propose in this section an antichain-based algorithm for
checking universality without explicit determinization.

Such an algorithm has been proposed in [BHH+08] for finite (ranked) tree
automata. In the context of hedge automata, additional difficulties have to be
solved due to the fact that the accepted trees are unranked.

In our approach, the main idea is to find as fast as possible one tree rejected
by the hedge automaton (if it exists) by performing a kind of bottom-up implicit
determinization. Antichains will limit the computations.

3.3.1 Macrostates and Post operator 

To test universality of a hedge automaton $A$, we have to check that all the trees
of $T_\Sigma$ belong to $L(A)$. Instead of working with trees we work with sets of states,
which are called macrostates. A macrostate is associated with each tree $t$: it
is the set of all the states $q$ labeling the root of a run of $A$ on $t$, i.e. such
that $t \hookrightarrow_A q$. To compute the macrostates, we make bottom-up computations by
applying a Post operator defined as follows.

Definition 8. Let $A = (Q, \Sigma, Q_f, \Delta)$ be a hedge automaton. A macrostate is a
set of states $P \subseteq Q$. A macrostate word $\pi = P_1 P_2 \cdots P_n$, $n \ge 0$, is a word over
the alphabet $2^Q$. We denote by $\widehat{\pi}$ the set $\{p_1 p_2 \cdots p_n \mid p_i \in P_i, \forall i, 1 \le i \le n\}$.
Given $a \in \Sigma$ and $\pi$ a macrostate word, let

$$\mathrm{Post}_a(\pi) = \{q \in Q \mid \exists (a, L, q) \in \Delta : L \cap \widehat{\pi} \ne \emptyset\}$$

For $\mathcal{P} \subseteq 2^Q$ a set of macrostates, let

$$\mathrm{Post}(\mathcal{P}) = \{\mathrm{Post}_a(\pi) \mid a \in \Sigma, \pi \in \mathcal{P}^*\} \cup \mathcal{P}$$

and $\mathrm{Post}^*(\mathcal{P}) = \bigcup_{i \ge 0} \mathrm{Post}^i(\mathcal{P})$ such that $\mathrm{Post}^0(\mathcal{P}) = \mathcal{P}$, and for all $i > 0$,
$\mathrm{Post}^i(\mathcal{P}) = \mathrm{Post}(\mathrm{Post}^{i-1}(\mathcal{P}))$.

When $\pi = \varepsilon$, $\mathrm{Post}_a(\varepsilon)$ is the set of all states that can be assigned to an $a$-leaf
of a tree, with $a \in \Sigma$. If an $a$-node has $n$ children to which the macrostates $P_1$,
$\ldots$, $P_n$ have been assigned, then $\mathrm{Post}_a(P_1 \cdots P_n)$ is the set of all states that
can be assigned to this node.
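As a worked illustration, here are a few values of $\mathrm{Post}_a$ on the hedge automaton of Example 6. This is our own computation from the rules listed there, with $L_1 = Q^*$, $L_2 = q_b q_c$ and $L_3 = Q^* q_f Q^*$; it is not taken from the original figures.

```latex
\begin{aligned}
\mathrm{Post}_a(\varepsilon) &= \{q_a\}
  && \text{only } L_1 \text{ contains } \varepsilon \\
\mathrm{Post}_b(\varepsilon) &= \{q_b\}, \qquad
\mathrm{Post}_c(\varepsilon) = \{q_c\} \\
\mathrm{Post}_a(\{q_b\}\{q_c\}) &= \{q_a, q_f\}
  && q_b q_c \in L_1 \cap L_2,\ q_b q_c \notin L_3 \\
\mathrm{Post}_c(\{q_a, q_f\}) &= \{q_c, q_f\}
  && q_a \in L_1,\ q_f \in L_1 \cap L_3
\end{aligned}
```

The third line corresponds to an $a$-node with a $b$-leaf and a $c$-leaf as children: the macrostate contains the final state $q_f$, so such a tree is accepted.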

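The next lemma is immediate.

Lemma 9. Let $A = (Q, \Sigma, Q_f, \Delta)$ be a hedge automaton and $t \in T_\Sigma$ be such
that its root is an $a$-node with $n$ children. Let $P_i = \{q \in Q \mid t|_i \hookrightarrow_A q\}$ for
$1 \le i \le n$. Then

$$\mathrm{Post}_a(P_1 \cdots P_n) = \{q \in Q \mid t \hookrightarrow_A q\}.$$

Given a set $\mathcal{P}$ of macrostates, $\mathrm{Post}(\mathcal{P})$ is the set of all macrostates that
belong to $\mathcal{P}$ or can be obtained via $\mathrm{Post}_a(\pi)$ with any letter $a \in \Sigma$, and any
macrostate word $\pi = P_1 P_2 \cdots P_n$ with $P_i \in \mathcal{P}$, $\forall i$. More precisely, we have:

Lemma 10. Let $A = (Q, \Sigma, Q_f, \Delta)$ be a hedge automaton and $i \ge 1$. A
macrostate $P$ belongs to $\mathrm{Post}^i(\emptyset)$ iff there exists a tree $t \in T_\Sigma$ with height($t$) $\le i$
such that $P = \{q \in Q \mid t \hookrightarrow_A q\}$.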

Proof. We proceed by induction on $i$.

The basic case, $i = 1$, directly follows from $\mathrm{Post}^1(\emptyset) = \{\mathrm{Post}_a(\varepsilon) \mid a \in \Sigma\}$
and $\mathrm{Post}_a(\varepsilon) = \{q \mid t \hookrightarrow_A q\}$ with $t$ being an $a$-leaf.

Let $i > 1$ and suppose that the property holds for all $j$, $1 \le j < i$.

($\Rightarrow$) Let $P \in \mathrm{Post}^i(\emptyset)$. If $P \in \mathrm{Post}^{i-1}(\emptyset)$, then the property holds by induction
hypothesis. Otherwise there exist $n \ge 0$, $P_1, \ldots, P_n \in \mathrm{Post}^{i-1}(\emptyset)$, and $a \in \Sigma$,
such that $P = \mathrm{Post}_a(P_1 \cdots P_n)$. By induction hypothesis, $\forall k$, $1 \le k \le n$,
$\exists t_k \in T_\Sigma$ such that height($t_k$) $< i$ and $P_k = \{q \mid t_k \hookrightarrow_A q\}$. Let $t$ be the tree with
the $a$-root and the $n$ subtrees $t_1, \ldots, t_n$. Then height($t$) $\le i$ and $P = \{q \mid t \hookrightarrow_A q\}$
by Lemma 9.

($\Leftarrow$) Let $t \in T_\Sigma$ with height($t$) $\le i$ and $P = \{q \in Q \mid t \hookrightarrow_A q\}$. If height($t$) $< i$,
then by induction hypothesis $P \in \mathrm{Post}^{i-1}(\emptyset) \subseteq \mathrm{Post}^i(\emptyset)$. Otherwise let $a$ be the
label of the root of $t$ and $t|_1, \ldots, t|_n$ its $n$ subtrees. Let $P_k = \{q \in Q \mid t|_k \hookrightarrow_A q\}$,
$1 \le k \le n$. As height($t|_k$) $< i$, we have by induction hypothesis that $P_k \in
\mathrm{Post}^{i-1}(\emptyset)$. By Lemma 9, $P = \mathrm{Post}_a(P_1 \cdots P_n)$, and thus $P \in \mathrm{Post}^i(\emptyset)$. □

Given a tree $t \in T_\Sigma$ we define $P_t$ as the macrostate $P_t = \{q \in Q \mid t \hookrightarrow_A q\}$.
More generally, given a hedge $h = t_1 t_2 \cdots t_n \in H_\Sigma$ we denote by $\pi_h$
the macrostate word $\pi_h = P_{t_1} P_{t_2} \cdots P_{t_n}$. The previous lemmas indicate that
$\mathrm{Post}^*(\emptyset) = \{P_t \mid t \in T_\Sigma\}$, and more generally that $(\mathrm{Post}^*(\emptyset))^* = \{\pi_h \mid h \in H_\Sigma\}$.

The next proposition is an immediate consequence of Lemmas 9 and 10.

Proposition 11. Let $A = (Q, \Sigma, Q_f, \Delta)$ be a hedge automaton. Then $A$ is
universal iff $\forall P \in \mathrm{Post}^*(\emptyset)$, $P \cap Q_f \ne \emptyset$.

3.3.2 Relations and universality algorithm

Our method for checking universality of a hedge automaton is to compute
$\mathrm{Post}^*(\emptyset)$ by iteratively applying the Post operator. However, to get $\mathrm{Post}(\mathcal{P})$,
we have to compute $\mathcal{P}^*$, which is an infinite set of macrostate words. To
circumvent this problem, we represent a macrostate word by a relation as described
below, with the advantage that the set of relations is now finite.

We first introduce some notation. Let $A = (Q, \Sigma, Q_f, \Delta)$ be a hedge
automaton and $\mathcal{H}_A$ be the set of horizontal languages appearing in its transition
rules. We recall that these languages are regular. Let $L \in \mathcal{H}_A$ and $B_L$ be a
(word) automaton over the alphabet $Q$ that accepts $L$. Let $S_L$ be its set of
states, $I_L$ its set of initial states, and $F_L$ its set of final states. We denote by $B_A$
the automaton which is the disjoint union of all the automata $B_L$ with $L \in \mathcal{H}_A$.
Its set of states is denoted by $S_A = \bigcup_{L \in \mathcal{H}_A} S_L$. A run in $B_A$ from state $s \in S_A$
to state $s' \in S_A$ labeled by word $w \in Q^*$ is denoted by $s \overset{w}{\leadsto} s'$.

Definition 12. Let $A = (Q, \Sigma, Q_f, \Delta)$ be a hedge automaton and $\pi$ a macrostate
word. Then $\mathrm{rel}(\pi) \subseteq S_A \times S_A$ is the relation

$$\mathrm{rel}(\pi) = \{(s, s') \mid s \overset{w}{\leadsto} s' \text{ with } w \in \widehat{\pi}\}.$$

In other words, if $\pi = P_1 \cdots P_n$ with $P_i \subseteq Q$ for all $i$, then $(s, s')$ belongs to
$\mathrm{rel}(\pi)$ iff there is a path in $B_A$ from $s$ to $s'$ that is labeled by a word $p_1 \cdots p_n \in
\widehat{\pi}$. The notation rel is naturally extended to sets $W$ of macrostate words as
$\mathrm{rel}(W) = \{\mathrm{rel}(\pi) \mid \pi \in W\}$.

Notice there are finitely many relations $r \subseteq S_A \times S_A$, since $S_A$ is a finite set.
If $\mathcal{R}$ is a set of relations $r \subseteq S_A \times S_A$, then $\mathcal{R}^*$ denotes the set of all relations
obtained by composing relations in $\mathcal{R}$: $\mathcal{R}^* = \{r_1 \circ r_2 \circ \cdots \circ r_n \mid n \ge 0 \text{ and } r_i \in
\mathcal{R} \text{ for all } 1 \le i \le n\}$. In particular $\mathcal{R}^*$ contains the identity relation $\mathrm{id}_{S_A}$ over
$S_A$, obtained when $n = 0$.

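Lemma 13. Let $A = (Q, \Sigma, Q_f, \Delta)$ be a hedge automaton. If $\mathcal{P}$ is a set of
macrostates and $\mathcal{R}$ a set of relations such that $\mathrm{rel}(\mathcal{P}) = \mathcal{R}$, then $\mathrm{rel}(\mathcal{P}^*) = \mathcal{R}^*$.

Proof. Let us prove that for any macrostate word $\pi = P_1 \cdots P_n$, $\mathrm{rel}(\pi) =
\mathrm{rel}(P_1) \circ \cdots \circ \mathrm{rel}(P_n)$; the lemma is an immediate consequence.

Let $(s, s') \in \mathrm{rel}(P_1 \cdots P_n)$, that is, $\exists w = p_1 \cdots p_n \in \widehat{\pi} : s \overset{w}{\leadsto} s'$. Let $s =
s_1, s_2, \ldots, s_n, s_{n+1} = s' \in S_A$ be such that $s_i \overset{p_i}{\leadsto} s_{i+1}$ for all $i$. As $p_i \in P_i$, we have
$(s_i, s_{i+1}) \in \mathrm{rel}(P_i)$, and it follows that $(s, s') \in \mathrm{rel}(P_1) \circ \cdots \circ \mathrm{rel}(P_n)$.

Conversely, let $(s, s') \in \mathrm{rel}(P_1) \circ \cdots \circ \mathrm{rel}(P_n)$. Let $s = s_1, s_2, \ldots, s_n, s_{n+1} =
s' \in S_A$ be such that $(s_i, s_{i+1}) \in \mathrm{rel}(P_i)$ for all $i$. By definition, for all $i$, there
exists $p_i \in P_i$ such that $s_i \overset{p_i}{\leadsto} s_{i+1}$. So for $w = p_1 \cdots p_n$, we have $s_1 \overset{w}{\leadsto} s_{n+1}$,
showing that $(s, s') \in \mathrm{rel}(P_1 \cdots P_n)$. □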

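The Post operator is adapted to relations in the following way.

Definition 14. Let $A = (Q, \Sigma, Q_f, \Delta)$ be a hedge automaton, $r \subseteq S_A \times S_A$ a
relation, and $a \in \Sigma$ a letter. Then

$$\mathrm{Post}_a(r) = \{q \in Q \mid \exists (a, L, q) \in \Delta, \exists (s, s') \in r : s \in I_L \text{ and } s' \in F_L\}.$$

Lemma 15. Let $a \in \Sigma$ and $\pi$ be a macrostate word, then $\mathrm{Post}_a(\pi) = \mathrm{Post}_a(\mathrm{rel}(\pi))$.

Proof. For $a \in \Sigma$ and $\pi$ a macrostate word, we have

$$\begin{aligned}
\mathrm{Post}_a(\pi) &= \{q \in Q \mid \exists (a, L, q) \in \Delta : L \cap \widehat{\pi} \ne \emptyset\} \\
&= \{q \in Q \mid \exists (a, L, q) \in \Delta, \exists s, s' \in S_L, \exists w \in \widehat{\pi} : s \overset{w}{\leadsto} s',\ s \in I_L \text{ and } s' \in F_L\} \\
&= \{q \in Q \mid \exists (a, L, q) \in \Delta, \exists (s, s') \in \mathrm{rel}(\pi) : s \in I_L \text{ and } s' \in F_L\} \\
&= \mathrm{Post}_a(\mathrm{rel}(\pi)).
\end{aligned}$$

□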

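Lemma 16. Let $\mathcal{P}$ be a set of macrostates, then $\mathrm{Post}(\mathcal{P}) = \{\mathrm{Post}_a(r) \mid a \in
\Sigma, r \in \mathrm{rel}(\mathcal{P})^*\} \cup \mathcal{P}$.

Proof. By definition, $\mathrm{Post}(\mathcal{P}) = \{\mathrm{Post}_a(\pi) \mid a \in \Sigma, \pi \in \mathcal{P}^*\} \cup \mathcal{P}$. By Lemma
15, the first set of this union is equal to $\{\mathrm{Post}_a(\mathrm{rel}(\pi)) \mid a \in \Sigma, \pi \in \mathcal{P}^*\}$, which is equal to
$\{\mathrm{Post}_a(r) \mid a \in \Sigma, r \in \mathrm{rel}(\mathcal{P})^*\}$ by Lemma 13. □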

We are now able to propose an algorithm to check universality of hedge
automata. With Algorithm 2 the set $\mathrm{Post}^*(\emptyset)$ is computed incrementally and
the universality test is performed thanks to Proposition 11. More precisely, at
step $i$, variable $\mathcal{P}$ is used for $\mathrm{Post}^i(\emptyset)$ and variable $\mathcal{R}^*$ is used for $\mathrm{rel}(\mathcal{P})^*$.
We compute $\mathcal{R}^*$ with Function CompositionClosure, and then possible new
macrostates with $\{\mathrm{Post}_a(r) \mid a \in \Sigma, r \in \mathcal{R}^*\}$. The algorithm stops when no
new macrostate is found or the hedge automaton is declared not universal.

Let us detail Function CompositionClosure($\mathcal{R}^*$, $\mathcal{R}'$) which computes the
set $(\mathcal{R}^* \cup \mathcal{R}')^*$. In Algorithm 3 we show how to compute $(\mathcal{R}^* \cup \mathcal{R}')^*$ given
the inputs $\mathcal{R}^*$ and $\mathcal{R}'$, without recomputing $\mathcal{R}^*$ from $\mathcal{R}$. Initially, Relations is
equal to $\mathcal{R}^* \cup \mathcal{R}'$ and will be equal to $(\mathcal{R}^* \cup \mathcal{R}')^*$ at the end of the computation.
ToProcess contains the relations that can produce new relations by composition
with an element of Relations.

Proposition 17. Given $\mathcal{R}^*$ and $\mathcal{R}'$, Algorithm 3 computes $(\mathcal{R}^* \cup \mathcal{R}')^*$.

Proof. Let Relations be the set computed by Algorithm 3. Clearly, Relations $\subseteq
(\mathcal{R}^* \cup \mathcal{R}')^*$. Assume by contradiction there exists $r$ that belongs to $(\mathcal{R}^* \cup \mathcal{R}')^* \setminus$
Relations. Then $r \notin \mathcal{R}^* \cup \mathcal{R}'$ and we can suppose wlog that $r = r'_2 \circ r'_1$ with
$r'_1, r'_2 \in$ Relations. Notice that at least one element among $r'_1, r'_2$ has been added
to ToProcess during the execution of Algorithm 3, since otherwise $r'_1, r'_2 \in \mathcal{R}^*$
and then $r \in \mathcal{R}^*$. If $r'_1$ is the last one (among $r'_1, r'_2$) to be popped from
ToProcess, then the relation $r'_2 \circ r'_1$ is added to NewRelations, which leads to a
contradiction. The conclusion is similar if $r'_2$ is the last one to be popped. □

Algorithm 2 Checking universality
function Universality($A$)
    $\mathcal{P} \leftarrow \emptyset$
    $\mathcal{R}^* \leftarrow \{\mathrm{id}_{S_A}\}$
    repeat
        $\mathcal{P}_{new} \leftarrow \{\mathrm{Post}_a(r) \mid a \in \Sigma, r \in \mathcal{R}^*\}$
        if $\exists P \in \mathcal{P}_{new} : P \cap Q_f = \emptyset$ then
            return False // Not universal
        end if
        $\mathcal{R}' \leftarrow \mathrm{rel}(\mathcal{P}_{new} \setminus \mathcal{P}) \setminus \mathcal{R}^*$
        if $\mathcal{R}' \ne \emptyset$ then
            $\mathcal{P} \leftarrow \mathcal{P} \cup \mathcal{P}_{new}$
            $\mathcal{R}^* \leftarrow$ CompositionClosure($\mathcal{R}^*$, $\mathcal{R}'$)
        end if
    until $\mathcal{R}' = \emptyset$
    return True // Universal
end function



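Algorithm 3 Computing $(\mathcal{R}^* \cup \mathcal{R}')^*$
function CompositionClosure($\mathcal{R}^*$, $\mathcal{R}'$)
    Relations $\leftarrow \mathcal{R}^* \cup \mathcal{R}'$
    ToProcess $\leftarrow \mathcal{R}'$
    while ToProcess $\ne \emptyset$ do
        rel $\leftarrow$ Pop(ToProcess)
        NewRelations $\leftarrow \emptyset$
        for $r \in$ Relations do
            NewRelations $\leftarrow$ NewRelations $\cup\ \{r \circ \mathrm{rel}, \mathrm{rel} \circ r\}$
        end for
        ToProcess $\leftarrow$ ToProcess $\cup$ (NewRelations $\setminus$ Relations)
        Relations $\leftarrow$ Relations $\cup$ NewRelations
    end while
    return Relations
end function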
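Algorithms 2 and 3 can be sketched together in Python. The representation below is our own assumption: a relation is a frozenset of state pairs, a rule is a triple `(a, L, q)` where `L` is a name, `nfas[L]` gives the initial and final states of $B_L$, and `delta` is the set of transitions `(s, q, s2)` of the disjoint union $B_A$. The example automaton at the end (accepting the trees over $\{a\}$ of height at most 2) is hypothetical.

```python
def compose(r1, r2):
    """Relational composition: first r1, then r2."""
    return frozenset((s, u) for (s, t) in r1 for (t2, u) in r2 if t == t2)


def composition_closure(closed, new):
    """Algorithm 3: closure of closed | new under composition, assuming
    `closed` is already closed under composition."""
    relations = set(closed) | set(new)
    to_process = set(new)
    while to_process:
        rel = to_process.pop()
        new_relations = set()
        for r in relations:
            new_relations |= {compose(r, rel), compose(rel, r)}
        to_process |= new_relations - relations
        relations |= new_relations
    return relations


def universality(alphabet, final, rules, nfas, delta):
    """Algorithm 2: True iff the hedge automaton accepts every tree."""
    nfa_states = {s for (s, _, _) in delta} | {s for (_, _, s) in delta}
    for init, fin in nfas.values():
        nfa_states |= set(init) | set(fin)

    def rel(P):                          # rel of the one-letter word P
        return frozenset((s, s2) for (s, q, s2) in delta if q in P)

    def post(a, r):                      # Post_a on a relation
        return frozenset(q for (b, L, q) in rules if b == a
                         for (s, s2) in r
                         if s in nfas[L][0] and s2 in nfas[L][1])

    macrostates = set()
    closure = {frozenset((s, s) for s in nfa_states)}   # identity relation
    while True:
        new = {post(a, r) for a in alphabet for r in closure}
        if any(not (P & final) for P in new):
            return False                 # some tree has no accepting run
        new_rels = {rel(P) for P in new - macrostates} - closure
        if not new_rels:
            return True
        macrostates |= new
        closure = composition_closure(closure, new_rels)


# Hypothetical example: q1 labels leaves, q2 labels nodes whose children are
# all leaves (at least one child); both states are final.
RULES = [("a", "Leps", "q1"), ("a", "Lplus", "q2")]
NFAS = {"Leps": ({"e0"}, {"e0"}), "Lplus": ({"p0"}, {"p1"})}
DELTA = {("p0", "q1", "p1"), ("p1", "q1", "p1")}
```

On this example, `universality({"a"}, {"q1", "q2"}, RULES, NFAS, DELTA)` returns `False`: the third iteration produces the empty macrostate, which corresponds to trees of height 3 having no run at all.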
3.3.3 Antichain-based optimization

In this section we explain how to use the concept of antichain for saving
computations. We show that it is sufficient to only compute the $\subseteq$-minimal elements
of $\mathrm{Post}^*(\emptyset)$ for checking universality.

Consider the set $2^Q$ of all macrostates, with the $\subseteq$ operator. An antichain
$\mathcal{P}$ of macrostates is a set of pairwise incomparable macrostates with respect to
$\subseteq$. Given a set $\mathcal{P}$ of macrostates, we denote by $\lfloor \mathcal{P} \rfloor$ the $\subseteq$-minimal elements
of $\mathcal{P}$; similarly we denote by $\lceil \mathcal{P} \rceil$ the $\subseteq$-maximal elements of $\mathcal{P}$. A set $\mathcal{P}$ of
macrostates is $\subseteq$-upward closed (resp. $\subseteq$-downward closed) if for all $P \in \mathcal{P}$ and
$P \subseteq P'$ (resp. $P' \subseteq P$), we have $P' \in \mathcal{P}$. The same notions can be defined for
a set of relations (instead of macrostates).
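The operators $\lfloor \cdot \rfloor$ and $\lceil \cdot \rceil$ can be sketched directly from their definitions. The function names `minimal` and `maximal` are our own, and the quadratic pairwise comparison is a simple illustration rather than an efficient antichain data structure. The same code applies to relations, since they are also sets (of pairs).

```python
def minimal(sets):
    """The antichain of subset-minimal elements of a family of sets."""
    sets = {frozenset(s) for s in sets}
    return {s for s in sets if not any(t < s for t in sets)}


def maximal(sets):
    """The antichain of subset-maximal elements of a family of sets."""
    sets = {frozenset(s) for s in sets}
    return {s for s in sets if not any(s < t for t in sets)}
```

For instance, `minimal([{1}, {1, 2}, {2, 3}, {3}])` keeps only `{1}` and `{3}`, since each of the two other sets strictly contains one of them.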

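Definition 18. Let $\mathcal{A} = (Q, \Sigma, Q_f, \Delta)$ be a hedge automaton. Let $\mathcal{P} \subseteq 2^Q$ be
a set of macrostates, let

$\mathit{Post}_{\lfloor\rfloor}(\mathcal{P}) = \lfloor \mathit{Post}(\mathcal{P}) \rfloor$

and $\mathit{Post}^*_{\lfloor\rfloor}(\mathcal{P}) = \bigcup_{i \ge 0} \mathit{Post}^i_{\lfloor\rfloor}(\mathcal{P})$ such that $\mathit{Post}^0_{\lfloor\rfloor}(\mathcal{P}) = \lfloor\mathcal{P}\rfloor$, and for all
$i > 0$, $\mathit{Post}^i_{\lfloor\rfloor}(\mathcal{P}) = \mathit{Post}_{\lfloor\rfloor}(\mathit{Post}^{i-1}_{\lfloor\rfloor}(\mathcal{P}))$.

Lemma 19. Given $\mathcal{P}$ a set of macrostates, for all $P \in \mathit{Post}^*(\mathcal{P})$, there exists
$P' \in \mathit{Post}^*_{\lfloor\rfloor}(\mathcal{P})$ such that $P' \subseteq P$.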

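Proof. The proof is done by induction on $i$ such that $\mathit{Post}^*(\mathcal{P}) = \bigcup_{i \ge 0} \mathit{Post}^i(\mathcal{P})$,
and on the next two observations:

• Given $a \in \Sigma$, and $r, r'$ two relations over $S_\mathcal{A}$, if $r \subseteq r'$ then
$\mathit{Post}_a(r) \subseteq \mathit{Post}_a(r')$.

• Let $r_1, \ldots, r_n, r'_1, \ldots, r'_n$ be relations over $S_\mathcal{A}$. If $r_i \subseteq r'_i$ for all $1 \le i \le n$, then
$r_1 \circ \cdots \circ r_n \subseteq r'_1 \circ \cdots \circ r'_n$.

□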

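Notice that thanks to Lemma 16, given an antichain of macrostates $\mathcal{P}$, we
can compute $\mathit{Post}_{\lfloor\rfloor}(\mathcal{P})$ as $\lfloor \{\mathit{Post}_a(r) \mid a \in \Sigma, r \in \lfloor \mathrm{rel}(\mathcal{P})^* \rfloor\} \cup \mathcal{P} \rfloor$. We have
the next counterpart of Proposition 11.

Proposition 20. Let $\mathcal{A} = (Q, \Sigma, Q_f, \Delta)$ be a hedge automaton. $\mathcal{A}$ is universal
if and only if $\forall P \in \mathit{Post}^*_{\lfloor\rfloor}(\emptyset)$, $P \cap Q_f \neq \emptyset$.

Proof. The proof is based on Proposition 11.

($\Rightarrow$) As $\mathit{Post}^*_{\lfloor\rfloor}(\emptyset) \subseteq \mathit{Post}^*(\emptyset)$, the proof is immediate.
($\Leftarrow$) Suppose that $\forall P \in \mathit{Post}^*_{\lfloor\rfloor}(\emptyset)$, $P \cap Q_f \neq \emptyset$. Let $P' \in \mathit{Post}^*(\emptyset)$. By
Lemma 19, $\exists P \in \mathit{Post}^*_{\lfloor\rfloor}(\emptyset) : P \subseteq P'$. It follows that $P' \cap Q_f \neq \emptyset$. □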

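Algorithm 4 checks whether a given hedge automaton is universal by
incrementally computing $\mathit{Post}^*_{\lfloor\rfloor}(\emptyset)$. It is an adaptation of Algorithm 2.

Algorithm 4 Checking universality

function Universality($\mathcal{A}$)
    $\mathcal{P} \leftarrow \emptyset$
    $\mathcal{R}^*_{min} \leftarrow \{\mathrm{id}_{S_\mathcal{A}}\}$
    repeat
        $\mathcal{P}_{new} \leftarrow \lfloor \{\mathit{Post}_a(r) \mid a \in \Sigma, r \in \mathcal{R}^*_{min}\} \rfloor$
        if $\exists P \in \mathcal{P}_{new} : P \cap F = \emptyset$ then
            return False // Not universal
        end if
        $\mathcal{R}' \leftarrow \mathrm{rel}(\mathcal{P}_{new} \setminus \mathcal{P}) \setminus \mathcal{R}^*_{min}$
        if $\mathcal{R}' \neq \emptyset$ then
            $\mathcal{P} \leftarrow \lfloor \mathcal{P} \cup \mathcal{P}_{new} \rfloor$
            $\mathcal{R}^*_{min} \leftarrow \lfloor$ CompositionClosure($\mathcal{R}^*_{min}, \mathcal{R}'$) $\rfloor$
        end if
    until $\mathcal{R}' = \emptyset$
    return True // Universal
end function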



Notice that in Algorithm 4, to compute $\lfloor \mathrm{rel}(\mathcal{P})^* \rfloor$, we first make a call to
Function CompositionClosure and then we only keep the $\subseteq$-minimal elements
of the result. An optimisation could be, at each step of the CompositionClosure
computation, to only consider the minimal elements.

3.4 Checking u-universality

In this section, given $\mathcal{A}$ a hedge automaton and $u \neq \varepsilon$ a word in $\mathit{PPref}(T_\Sigma)$, we
propose a method to check whether $\mathcal{A}$ is u-universal. This method is incremental,
as explained in Section 2.3. As in the previous section, we first propose our
approach, then transform it into an algorithm (thanks to relations), and finally
propose some optimizations.

We need the following notation. Let $u$ be the current read proper prefix of
$[t_0]$ for a given tree $t_0$. If $u = a_1[h_1] a_2[h_2] \cdots a_n[h_n]$ with $a_i \in \Sigma$, $h_i \in H_\Sigma$, for
$1 \le i \le n$, then $\mathit{open}(u) = a_1 a_2 \cdots a_n$. In other words, $a_1, a_2, \ldots, a_n$ are the
read open tags whose closing tags have not been read yet. The partial reading
of $t_0$ according to $u$ indicates a current list of ancestors respectively labeled by
$a_1, a_2, \ldots, a_n$, as depicted in Figure 4.
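
The word $\mathit{open}(u)$ can be computed with a simple stack while reading the prefix. The following Python sketch assumes, for illustration only, that a letter of the linearization is encoded as a pair `("open", label)` or `("close", label)`; this encoding and the name `open_tags` are not from the paper.

```python
def open_tags(prefix):
    """Return open(u): labels of the open tags of `prefix` that are not yet closed."""
    stack = []
    for kind, label in prefix:
        if kind == "open":
            stack.append(label)
        elif kind == "close":
            # In a prefix of a linearization, a closing tag matches the last open tag.
            if not stack or stack[-1] != label:
                raise ValueError("not a prefix of a linearization")
            stack.pop()
        else:
            raise ValueError("unknown letter kind: %r" % (kind,))
    return stack


# u = a [b /b] c [d /d]: the hedges after a and after c are removed.
u = [("open", "a"), ("open", "b"), ("close", "b"),
     ("open", "c"), ("open", "d"), ("close", "d")]
ancestors = open_tags(u)
```

The length of the returned list is the depth of the current node, which is also the depth of the stack used by the incremental algorithm later in this section.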

Given $u$, let $w_i = a_1[h_1] \cdots a_{i-1}[h_{i-1}]$, for $1 \le i \le n$, such that $w_1 = \varepsilon$. The
incremental method is based on the usage of some sets

$X_{w_1 a_1}, X_{w_2 a_2}, \ldots, X_{w_n a_n}$

such that each $X_{w_i a_i}$ is defined from $X_{w_{i-1} a_{i-1}}$, with the underlying idea that $\mathcal{A}$
is $w_i a_i$-universal iff $X_{w_i a_i}$ is empty. This makes it possible to check u-universality when
$u$ ends with a $\Sigma$-symbol. Moreover, we will see that each element of $X_{w_i a_i}$ is
a witness of some word $v$ such that the tree $t$ with $[t] = w_i a_i v$ is not accepted
by $\mathcal{A}$. For words $u$ ending with a $\bar{\Sigma}$-symbol, we will explain at the end of this
section how the test of $w_i a_i [h_i]$-universality can be easily performed using the
set $X_{w_i a_i}$.

Figure 4: Current reading of a tree $t_0$ according to the prefix
$a_1[h_1] a_2[h_2] \cdots a_n[h_n]$.

3.4.1 Incremental approach 

Let us give the definition of $X_{w_i a_i}$ for all $i$. We begin with the basic case $i = 1$,
i.e. with set $X_a$.

We use notation $P_{T_\Sigma}$ for $\mathit{Post}^*(\emptyset)$ and $\Pi_{H_\Sigma}$ for $(\mathit{Post}^*(\emptyset))^*$ as introduced in
Section 3.3 (recall that $\mathit{Post}^*(\emptyset) = \{P_t \mid t \in T_\Sigma\}$ and $(\mathit{Post}^*(\emptyset))^* = \{\pi_h \mid h \in H_\Sigma\}$
by Lemma 10). Given a set $W$ of macrostate words, we define $\mathit{Pref}(W)$ as
the set $\{\pi \in \Pi_{H_\Sigma} \mid \exists \pi' \in \Pi_{H_\Sigma} : \pi\pi' \in W\}$.

Basic case We need to define $X_a$ such that $X_a = \emptyset$ iff $\mathcal{A}$ is a-universal, i.e.
all trees $t$ such that $[t] = a[h]\bar{a}$ with $h \in H_\Sigma$ are accepted by $\mathcal{A}$. The test of
a-universality is performed in two steps. We first collect all macrostate words
$\pi_h \in \Pi_{H_\Sigma}$ (see Lemmas 9 and 10). Then for each of them we compute $\mathit{Post}_a(\pi_h)$
and check whether $\mathit{Post}_a(\pi_h) \cap Q_f \neq \emptyset$ (see Proposition 11). If for some $\pi_h$,
we have $\mathit{Post}_a(\pi_h) \cap Q_f = \emptyset$, then $\pi_h$ is a witness of non a-universality of $\mathcal{A}$,
since $a[h]\bar{a}$ is not accepted by $\mathcal{A}$. More precisely, we have the next definition
and proposition.

Definition 21. Let $\mathcal{A} = (Q, \Sigma, Q_f, \Delta)$ be a hedge automaton, and let $a \in \Sigma$ be
a letter. We define

$X_a = \{\pi \in \Pi_{H_\Sigma} \mid \mathit{Post}_a(\pi) \cap Q_f = \emptyset\}$.

Proposition 22. $\mathcal{A}$ is a-universal iff $X_a = \emptyset$. Moreover, if $X_a$ is not empty,
for all $\pi \in X_a$, let $h \in H_\Sigma$ be such that $\pi = \pi_h$. Then $a[h]\bar{a} \in [T_\Sigma \setminus L(\mathcal{A})]$.

Let us now proceed with the general case, that is, the definition of $X_{w_i a_i}$
with $i > 1$. For all proper prefixes $w_j a_j$ of $w_i a_i$, we can suppose that $\mathcal{A}$ is not
$w_j a_j$-universal, otherwise $\mathcal{A}$ would be trivially $w_i a_i$-universal. We define $X_{w_i a_i}$
and then explain how to check $w_i a_i$-universality knowing $X_{w_i a_i}$.

General case Let $wa$ with $a \in \Sigma$. We first define $X_{wa}$. Let $w = w'a'[h']$
with $a' \in \Sigma$ and $h' \in H_\Sigma$. We suppose that $\mathcal{A}$ is not $w'a'$-universal, and that
$X_{w'a'} \neq \emptyset$. Moreover, $X_{w'a'}$ contains a witness of a word $v$ such that the tree $t$
with $[t] = w'a'v$ is not accepted by $\mathcal{A}$.




Figure 5: Current reading according to the prefix $wa$. (a) The prefix $w'a'[h']a$.
(b) The hedge $g$ with $[g] = [h']a[h_1]\bar{a}[h_2]$.

Let us define the set $X_{wa}$ from the set $X_{w'a'}$. In Figure 5 (a), we indicate
the current reading of a tree according to $wa$: an internal node labeled by $a'$
with a sequence of subtrees equal to $h'$ followed by a child labeled by $a$. With
this figure, we notice that $\mathcal{A}$ is not wa-universal iff there exist $h_1, h_2 \in H_\Sigma$
such that for the hedge $g$ with $[g] = [h']a[h_1]\bar{a}[h_2]$, we have $\pi_g \in X_{w'a'}$ (see
Figure 5 (b)). This observation leads to the next definition of $X_{wa}$.

Definition 23. Let $wa \in \mathit{PPref}(T_\Sigma)$ with $w = w'a'[h']$, $a, a' \in \Sigma$ and $h' \in H_\Sigma$.
Let $\mathcal{A} = (Q, \Sigma, Q_f, \Delta)$ be a hedge automaton. We define

$X_{wa} = \{\pi \in \Pi_{H_\Sigma} \mid \pi_{h'} \mathit{Post}_a(\pi) \in \mathit{Pref}(X_{w'a'})\}$.

As for the basic case (see Proposition 22), we have the next proposition.

Proposition 24. $\mathcal{A}$ is wa-universal iff $X_{wa} = \emptyset$. Moreover, if $X_{wa}$ is not
empty, then $X_{wa} = \{\pi_h \in \Pi_{H_\Sigma} \mid \exists v : wa[h]\bar{a}v \in [T_\Sigma \setminus L(\mathcal{A})]\}$.

Proof. We proceed by induction on $w$ to prove that $X_{wa} = \{\pi_h \in \Pi_{H_\Sigma} \mid \exists v :
wa[h]\bar{a}v \in [T_\Sigma \setminus L(\mathcal{A})]\}$. The basic case, $w = \varepsilon$, directly follows from
Proposition 22.

Let $w = w'a'[h']$ with $a' \in \Sigma$ and $h' \in H_\Sigma$. Suppose that the property holds
for $X_{w'a'}$, i.e. $X_{w'a'} = \{\pi_{h'} \in \Pi_{H_\Sigma} \mid \exists v' : w'a'[h']\bar{a}'v' \in [T_\Sigma \setminus L(\mathcal{A})]\}$.

($\subseteq$) Let $\pi_h \in X_{wa}$. By definition, $\exists h', h'' \in H_\Sigma : \pi_{h'} \mathit{Post}_a(\pi_h) \pi_{h''} \in X_{w'a'}$.
Then, by induction hypothesis, $\exists v' : w'a'[h']a[h]\bar{a}[h'']\bar{a}'v' \in [T_\Sigma \setminus L(\mathcal{A})]$. Let
$v = [h'']\bar{a}'v'$, then $wa[h]\bar{a}v \in [T_\Sigma \setminus L(\mathcal{A})]$.

($\supseteq$) Let $\pi_h \in \Pi_{H_\Sigma}$ such that $\exists v : wa[h]\bar{a}v \in [T_\Sigma \setminus L(\mathcal{A})]$. So there exists a word $v'$
and hedges $h', h''$ such that $w'a'[h']a[h]\bar{a}[h'']\bar{a}'v' \in [T_\Sigma \setminus L(\mathcal{A})]$. By induction
hypothesis, $\pi_g \in X_{w'a'}$ with $[g] = [h']a[h]\bar{a}[h'']$, and thus $\pi_h \in X_{wa}$. □

In this section, given a tree $t_0$ and the current read prefix $u$ of $[t_0]$, we
have shown how to test incrementally for u-universality as follows. Suppose
that $u = a_1[h_1] a_2[h_2] \cdots a_n[h_n]$ with $a_i \in \Sigma$, $h_i \in H_\Sigma$ for $1 \le i \le n$, and let
$w_i = a_1[h_1] \cdots a_{i-1}[h_{i-1}]$, for $1 \le i \le n$. We have defined set $X_a$ and then each
set $X_{w_i a_i}$, $1 < i \le n$, from $X_{w_{i-1} a_{i-1}}$, such that $\mathcal{A}$ is $w_i a_i$-universal iff $X_{w_i a_i}$ is
empty.

It should be noted that it is also possible to test whether $\mathcal{A}$ is $w_i a_i [h_i]$-universal
thanks to set $X_{w_i a_i}$. Indeed, by Proposition 24, $\mathcal{A}$ is $w_i a_i [h_i]$-universal
iff $\nexists \pi \in \Pi_{H_\Sigma} : \pi_{h_i} \pi \in X_{w_i a_i}$.

3.4.2 Algorithm for checking u-universality

In this section, we propose an algorithm for u-universality checking. As done
before for universality in Section 3.3.2, we need to represent a macrostate word
$\pi$ by the relation $\mathrm{rel}(\pi)$. Definitions 21 and 23 are rephrased as follows. Given a
set $Y$ of relations, we define $\mathit{Pref}(Y)$ as the set $\{r \in \mathrm{rel}(\Pi_{H_\Sigma}) \mid \exists r' \in \mathrm{rel}(\Pi_{H_\Sigma}) :
rr' \in Y\}$.

Definition 25. Let $\mathcal{A} = (Q, \Sigma, Q_f, \Delta)$ be a hedge automaton, and let $wa \in
\mathit{PPref}(T_\Sigma)$ with $a \in \Sigma$.

If $w = \varepsilon$, we define $Y_a = \{r \in \mathrm{rel}(\Pi_{H_\Sigma}) \mid \mathit{Post}_a(r) \cap Q_f = \emptyset\}$.

If $w \neq \varepsilon$, given $w = w'a'[h']$ with $a' \in \Sigma$ and $h' \in H_\Sigma$, we define $Y_{wa} =
\{r \in \mathrm{rel}(\Pi_{H_\Sigma}) \mid \mathrm{rel}(\pi_{h'}) \, \mathrm{rel}(\mathit{Post}_a(r)) \in \mathit{Pref}(Y_{w'a'})\}$.

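Lemma 26. $Y_{wa} = \mathrm{rel}(X_{wa})$.

Proof. The proof is done by induction on $w$.

The basic case, $Y_a = \mathrm{rel}(X_a)$, follows from Lemma 15.

Let $w = w'a'[h']$ with $a' \in \Sigma$ and $h' \in H_\Sigma$. Suppose that $Y_{w'a'} = \mathrm{rel}(X_{w'a'})$
holds. Notice that for $\pi \in X_{wa}$ and $\pi' \in \Pi_{H_\Sigma}$, if $\mathrm{rel}(\pi) = \mathrm{rel}(\pi')$, then $\pi' \in X_{wa}$
(see Lemma 15). We have for $r = \mathrm{rel}(\pi) \in \mathrm{rel}(\Pi_{H_\Sigma})$:

$r \in Y_{wa} \iff \exists r' \in \mathrm{rel}(\Pi_{H_\Sigma}) : \mathrm{rel}(\pi_{h'}) \, \mathrm{rel}(\mathit{Post}_a(r)) \, r' \in Y_{w'a'}$
$\iff \exists \pi' \in \Pi_{H_\Sigma} : \mathrm{rel}(\pi_{h'} \mathit{Post}_a(\pi) \pi') \in \mathrm{rel}(X_{w'a'})$
$\iff r \in \mathrm{rel}(X_{wa})$

It follows that $Y_{wa} = \mathrm{rel}(X_{wa})$. □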

The next proposition is the equivalent of Propositions 22 and 24, as a
consequence of Lemma 26.

Proposition 27. $\mathcal{A}$ is wa-universal iff $Y_{wa}$ is empty.

By definition of $Y_{wa}$, it follows that $\mathcal{A}$ is $wa[h]$-universal, with $h \in H_\Sigma$, iff
$\nexists r \in \mathrm{rel}(\Pi_{H_\Sigma}) : \mathrm{rel}(\pi_h) \, r \in Y_{wa}$.

Let us now describe an algorithm to test whether a hedge automaton $\mathcal{A}$ is
u-universal. We recall that $t_0$ is a given tree and $u$ its current read prefix. This
algorithm is incremental and thus has already checked that $\mathcal{A}$ is not wa-universal
for all non-empty proper prefixes $wa$ of $u$ thanks to Proposition 27.

More precisely, let $u = a_1[h_1] a_2[h_2] \cdots a_n[h_n]$ and $w_i = a_1[h_1] \cdots a_{i-1}[h_{i-1}]$,
for $1 \le i \le n$, and suppose that for all $i$ the sets $Y_{w_i a_i}$ have been computed
and seen to be non-empty. A stack is used to store all triples $(Y_{w_i a_i}, \mathrm{rel}(h_i), a_i)$,
$1 \le i \le n$, with the triple $(Y_{w_n a_n}, \mathrm{rel}(h_n), a_n)$ at the top of the stack. The stack
has a depth equal to the length of $\mathit{open}(u)$.

In Algorithm 5, four functions are called according to the letter that is
currently read in $[t_0]$, knowing that $u$ is the last read prefix of $[t_0]$. If it is the
first letter $a$ (resp. last letter $\bar{a}$) of $[t_0]$, then Function OpenRoot($a$) (resp.
CloseRoot($\bar{a}$)) is called. Otherwise either Function NextOpenTag($a$) or
NextClosedTag($\bar{a}$) is called according to whether $a$ or $\bar{a}$ is the next read
letter.

Function OpenRoot($a$) computes the set $Y_a$ as defined in Definition 25. If
$Y_a$ is empty, then $\mathcal{A}$ is declared a-universal. Otherwise, the stack is initialized
with the triple $(Y_a, \mathrm{id}, a)$.

Function CloseRoot($\bar{a}$) pops the stack to get its unique triple $(Y_a, r, a)$
(since $\bar{a}$ is the last letter of $[t_0]$). It checks whether $[t_0] = u\bar{a}$ is accepted by the
automaton with the emptiness test of $\mathit{Post}_a(r) \cap Q_f$.

If $u \neq \varepsilon$ and the letter read after $u$ is $a$ with $a \in \Sigma$, then Function
NextOpenTag($a$) reads the triple $(Y', r', a')$ at the top of the stack and computes
the set $Y_{ua}$ from the set $Y'$ (as in Definition 25). If $Y_{ua}$ is empty, then $\mathcal{A}$ is declared
ua-universal. Otherwise, the triple $(Y_{ua}, \mathrm{id}, a)$ is pushed on the stack. If the
letter read after $u$ is $\bar{a}$ with $a \in \Sigma$, and $u\bar{a} \neq [t_0]$, then Function
NextClosedTag($\bar{a}$) pops the stack once to get the triple $(Y, r, a)$ (notice that $\bar{a}$ is the closing
tag of $a$ in this triple). It then modifies the triple $(Y', r', a')$ at the top of the stack,
by replacing $r'$ by $r'' = r' \circ \mathrm{rel}(\mathit{Post}_a(r))$ (see Figure 5 (b)). If there does not
exist $s \in \mathrm{rel}(\Pi_{H_\Sigma})$ such that $r''s \in Y'$, then $\mathcal{A}$ is declared to be $u\bar{a}$-universal.

These four functions return True as soon as they can declare that $\mathcal{A}$ is
u-universal for the current read prefix $u$ of $[t_0]$.
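
The choice among the four functions only depends on the kind of the letter and on the current depth, so it can be decided in a streaming fashion. The Python sketch below illustrates this dispatch only; it does not implement the functions themselves. The encoding of letters as `("open", label)` or `("close", label)` and the name `handler_names` are assumptions of this example.

```python
def handler_names(stream):
    """Yield, for each letter of a linearization, the name of the function handling it."""
    depth = 0  # number of currently open tags, i.e. the length of open(u)
    for kind, _label in stream:
        if kind == "open":
            # The very first open tag is the root.
            yield "OpenRoot" if depth == 0 else "NextOpenTag"
            depth += 1
        else:
            depth -= 1
            # Closing the last open tag closes the root.
            yield "CloseRoot" if depth == 0 else "NextClosedTag"


# Linearization of the tree a(b): a b /b /a
t0 = [("open", "a"), ("open", "b"), ("close", "b"), ("close", "a")]
calls = list(handler_names(t0))
```

In an actual implementation, reading would stop as soon as one of the functions returns True.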

Algorithm 5 Functions used for checking u-universality incrementally

function OpenRoot($a$)
    $Y \leftarrow \emptyset$
    for $r \in \mathrm{rel}(\Pi_{H_\Sigma})$ do
        if $\mathit{Post}_a(r) \cap Q_f = \emptyset$ then
            $Y \leftarrow Y \cup \{r\}$
        end if
    end for
    if $Y = \emptyset$ then
        return True // u-universal with u the current read prefix
    else
        Stack $\leftarrow \emptyset$
        Push(Stack, $(Y, \mathrm{id}, a)$)
    end if
end function

function CloseRoot($\bar{a}$)
    $(Y, r, a) \leftarrow$ Pop(Stack)
    if $\mathit{Post}_a(r) \cap Q_f = \emptyset$ then
        return False // $t_0$ is not accepted
    else
        return True // $t_0$ is accepted
    end if
end function

function NextOpenTag($a$)
    $(Y', r', a') \leftarrow$ Top(Stack)
    $Y \leftarrow \emptyset$
    for $r \in \mathrm{rel}(\Pi_{H_\Sigma})$ do
        if $r' \circ \mathrm{rel}(\mathit{Post}_a(r)) \in \mathit{Pref}(Y')$ then
            $Y \leftarrow Y \cup \{r\}$
        end if
    end for
    if $Y = \emptyset$ then
        return True // u-universal with u the current read prefix
    else
        Push(Stack, $(Y, \mathrm{id}, a)$)
    end if
end function

function NextClosedTag($\bar{a}$)
    $(Y, r, a) \leftarrow$ Pop(Stack)
    $(Y', r', a') \leftarrow$ Pop(Stack)
    $r' \leftarrow r' \circ \mathrm{rel}(\mathit{Post}_a(r))$
    if $\nexists s \in \mathrm{rel}(\Pi_{H_\Sigma}) : r' \circ s \in Y'$ then
        return True // u-universal with u the current read prefix
    end if
    Push(Stack, $(Y', r', a')$)
end function

3.4.3 Antichain-based optimization

In this section we explain how to use the concept of antichain to avoid some
computations when checking for u-universality. In particular we show that it
is sufficient to only compute the $\subseteq$-maximal elements of set $Y_{wa}$ as defined in
Definition 25.

Lemma 28. Let $wa \in \mathit{PPref}(T_\Sigma)$ with $a \in \Sigma$. Then $Y_{wa}$ is a $\subseteq$-downward closed set.

Proof. We proceed by induction on $w$. Notice that for $r, r' \in \mathrm{rel}(\Pi_{H_\Sigma})$ and
$a \in \Sigma$, if $r' \subseteq r$, then $\mathit{Post}_a(r') \subseteq \mathit{Post}_a(r)$.

Consider the basic case where $w = \varepsilon$. By definition $Y_a = \{r \in \mathrm{rel}(\Pi_{H_\Sigma}) \mid
\mathit{Post}_a(r) \cap Q_f = \emptyset\}$. By the previous remark, $Y_a$ is a $\subseteq$-downward closed set.

Let $w = w'a'[h']$, with $a' \in \Sigma$ and $h' \in H_\Sigma$. Let $r \in Y_{wa}$ and $r' \in \mathrm{rel}(\Pi_{H_\Sigma})$
such that $r' \subseteq r$. Let us show that $r' \in Y_{wa}$. As $r \in Y_{wa}$, $\exists r'' \in \mathrm{rel}(\Pi_{H_\Sigma}) :
\mathrm{rel}(\pi_{h'}) \, \mathrm{rel}(\mathit{Post}_a(r)) \, r'' \in Y_{w'a'}$. As $\mathit{Post}_a(r') \subseteq \mathit{Post}_a(r)$ and $Y_{w'a'}$ is
$\subseteq$-downward closed, it follows that $\mathrm{rel}(\pi_{h'}) \, \mathrm{rel}(\mathit{Post}_a(r')) \, r'' \in Y_{w'a'}$ and then
$r' \in Y_{wa}$. □

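As $Y_{wa}$ is $\subseteq$-downward closed, it can be described by the antichain $\lceil Y_{wa} \rceil$
of its maximal elements. Let $w = w'a'[h']$ with $a' \in \Sigma$ and $h' \in H_\Sigma$. The next
lemma shows that it is possible to compute $\lceil Y_{wa} \rceil$ from $\lceil Y_{w'a'} \rceil$ without knowing
the whole set $Y_{w'a'}$.

Lemma 29. For $r \in \mathrm{rel}(\Pi_{H_\Sigma})$, $r \in Y_{wa}$ iff there exist $r' \in \lfloor \mathrm{rel}(\Pi_{H_\Sigma}) \rfloor$ and
$s \in \lceil Y_{w'a'} \rceil$ such that $\mathrm{rel}(\pi_{h'}) \, \mathrm{rel}(\mathit{Post}_a(r)) \, r' \subseteq s$.

Proof.

$r \in Y_{wa} \iff \exists r' \in \mathrm{rel}(\Pi_{H_\Sigma}) : \mathrm{rel}(\pi_{h'}) \, \mathrm{rel}(\mathit{Post}_a(r)) \, r' \in Y_{w'a'}$ (Def. 25)
$\iff \exists r' \in \mathrm{rel}(\Pi_{H_\Sigma}), \exists s \in \lceil Y_{w'a'} \rceil : \mathrm{rel}(\pi_{h'}) \, \mathrm{rel}(\mathit{Post}_a(r)) \, r' \subseteq s$
$\iff \exists r' \in \lfloor \mathrm{rel}(\Pi_{H_\Sigma}) \rfloor, \exists s \in \lceil Y_{w'a'} \rceil : \mathrm{rel}(\pi_{h'}) \, \mathrm{rel}(\mathit{Post}_a(r)) \, r' \subseteq s$

□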

Based on the previous lemma, Algorithm 6 is an optimized version of Function
NextOpenTag($u, a$) which computes $Y = \lceil Y_{wa} \rceil$ from $Y' = \lceil Y_{w'a'} \rceil$ without
computing the entire set $Y_{wa}$. The idea is to have a set, called Candidates,
containing all elements that could potentially be in $Y$. Initially, it is the set
$\mathrm{rel}(\Pi_{H_\Sigma})$. Otherwise, suppose that $Y$ has been partially computed, then
Candidates is the set $\mathrm{rel}(\Pi_{H_\Sigma}) \setminus \{r' \mid \exists r \in Y : r' \subseteq r\}$. Function
MaximalElement(Candidates) returns a maximal element of the set Candidates.
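
The candidate-pruning idea can be illustrated independently of hedge automata: given a finite universe of sets and a membership test for a downward closed subset, the loop below returns its maximal elements. This Python sketch is an assumption-laden illustration, not the paper's Algorithm 6: `universe` stands for $\mathrm{rel}(\Pi_{H_\Sigma})$, `is_member` stands for the test of Lemma 29, and both names are hypothetical.

```python
def maximal_members(universe, is_member):
    """Maximal elements of {r in universe | is_member(r)}.

    Assumption: the set of members is downward closed within `universe`
    (if r is a member, every subset of r in `universe` is a member too).
    """
    candidates = set(universe)
    found = set()
    while candidates:
        # A set of largest cardinality is maximal for inclusion among the candidates.
        r = max(candidates, key=len)
        if is_member(r):
            found.add(r)
            # Everything below r is a member but not maximal: prune without testing.
            candidates = {c for c in candidates if not c <= r}
        else:
            candidates.discard(r)
    return found


subsets = [[], [1], [2], [3], [1, 2], [1, 3], [2, 3], [1, 2, 3]]
universe = [frozenset(s) for s in subsets]
result = maximal_members(universe, lambda r: r <= {1, 2} or r <= {3})
```

The benefit is that the membership test, which is the expensive part, is never run on elements lying below an already found maximal element.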

4 Safe configurations approach 

We present an algorithm for testing u-universality of a non-deterministic visibly
pushdown automaton $\mathcal{A}$. This algorithm is a generalization of the algorithm
for the deterministic case [GNT09], adding several optimizations to avoid huge
computations. As in Section 3, the algorithm is incremental in the sense that
the linearization $[t_0]$ of a given tree $t_0$ is read letter by letter, and while $\mathcal{A}$ is not
u-universal for the current read prefix $u$ of $[t_0]$, the next letter of $[t_0]$ is read.

4.1 Safe configurations 

Algorithm 6 Optimized Function NextOpenTag

function OptNextOpenTag($u, a$)
    $(Y', r', a') \leftarrow$ Top(Stack)
    $Y \leftarrow \emptyset$
    Candidates $\leftarrow \mathrm{rel}(\Pi_{H_\Sigma})$
    while Candidates $\neq \emptyset$ do
        $r \leftarrow$ MaximalElement(Candidates)
        if $\exists r'' \in \lfloor \mathrm{rel}(\Pi_{H_\Sigma}) \rfloor, \exists s \in Y' : r' \circ \mathrm{rel}(\mathit{Post}_a(r)) \circ r'' \subseteq s$ then
            $Y \leftarrow Y \cup \{r\}$
            Candidates $\leftarrow \{c \in$ Candidates $\mid c \not\subseteq r\}$
        else
            Candidates $\leftarrow$ Candidates $\setminus \{r\}$
        end if
    end while
    if $Y = \emptyset$ then
        return $\mathcal{A}$ is wa-universal
    else
        Push(Stack, $(Y, \mathrm{id}, a)$)
    end if
end function

In the deterministic case [GNT09], the algorithm relies on the incremental
computation of the set of safe states. In the non-deterministic case, safe states are
not enough to decide u-universality. Indeed, in [GNT09], safe states are computed
according to the unique run of the deterministic automaton on $u$. In fact,
safe configurations $(q, \sigma)$ are considered, but all these configurations have the
same stack $\sigma$ here, so only states $q$ have to be stored. When the automaton is
non-deterministic, we may have several runs on $u$, and each of them may use a
different stack. All these stacks have to be considered for testing u-universality,
so we cannot consider only states.

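Therefore, we have to consider safe configurations, or more precisely sets of
safe configurations as described in the next definition. We use notions about
VPAs that are defined in Section 2.2, as well as sets of configurations that are
antichains with respect to $\subseteq$, or $\subseteq$-upward (resp. $\subseteq$-downward) closed sets (see
Section 3.3.3).

Definition 30. Let $\mathcal{A}$ be a VPA and $\mathcal{C} \subseteq Q \times \Gamma^*$ be a set of configurations.
Let $u \in \mathit{PPref}(T_\Sigma)$ be a prefix.

• $\mathcal{C}$ is safe for $u$ if for every $v$ such that $uv \in [T_\Sigma]$, there exist $(q, \sigma) \in \mathcal{C}$
and $p \in Q_f$ such that $(q, \sigma) \xrightarrow{v} (p, \varepsilon)$ in $\mathcal{A}$.

• $\mathcal{C}$ is leaf-safe for $u$ if for every $v = \bar{a}v'$ with $\bar{a} \in \bar{\Sigma}$ such that $uv \in [T_\Sigma]$,
there exist $(q, \sigma) \in \mathcal{C}$ and $p \in Q_f$ such that $(q, \sigma) \xrightarrow{v} (p, \varepsilon)$ in $\mathcal{A}$.

We write $\mathit{Safe}(u)$ for $\{\mathcal{C} \mid \mathcal{C}$ is safe for $u\}$ and $\mathit{LSafe}(u)$ for $\{\mathcal{C} \mid \mathcal{C}$ is leaf-safe
for $u\}$.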

Intuitively, as stated in Theorem 32 below, if $\mathcal{C}$ is the set of configurations
reached in $\mathcal{A}$ after reading $u$, then $\mathcal{A}$ is u-universal iff $\mathcal{C}$ is safe for $u$. Indeed,
for every possible $v$, one can find in $\mathcal{C}$ at least one configuration leading to an
accepting configuration after reading $v$. We first note that, from the definitions,
if a set of configurations $\mathcal{C}$ is safe (resp. leaf-safe) for $u$, then a larger set $\mathcal{C}'$ is
also safe (resp. leaf-safe) for $u$.

Lemma 31. $\mathit{Safe}(u)$ and $\mathit{LSafe}(u)$ are $\subseteq$-upward closed sets.

Let $\mathit{Reach}(u)$ denote the set of configurations $(q, \sigma)$ such that $(q_0, \sigma_0) \xrightarrow{u}
(q, \sigma)$ for some initial configuration $(q_0, \sigma_0)$ of $\mathcal{A}$.

Theorem 32. $\mathcal{A}$ is u-universal iff $\mathit{Reach}(u) \in \mathit{Safe}(u)$.

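Proof. ($\Rightarrow$) Assume that $\mathcal{A}$ is u-universal. Consider the set $\mathcal{C}$ of configurations
$(q, \sigma)$ of $\mathcal{A}$ such that there exist $v \in (\Sigma \cup \bar{\Sigma})^*$, $q_i \in Q_i$ and $q_f \in Q_f$ verifying
$uv \in [T_\Sigma]$ and $(q_i, \varepsilon) \xrightarrow{u} (q, \sigma) \xrightarrow{v} (q_f, \varepsilon)$. We have $\mathcal{C} \subseteq \mathit{Reach}(u)$.

Let $v$ be such that $uv \in [T_\Sigma]$. As $\mathcal{A}$ is u-universal, there exists a configuration
$(q, \sigma) \in \mathcal{C}$ such that $(q, \sigma) \xrightarrow{v} (q_f, \varepsilon)$ with $q_f \in Q_f$. Hence $\mathcal{C} \in \mathit{Safe}(u)$. By
Lemma 31, we get $\mathit{Reach}(u) \in \mathit{Safe}(u)$.

($\Leftarrow$) Assume now that $\mathit{Reach}(u) \in \mathit{Safe}(u)$, and let $v$ be such that $uv \in [T_\Sigma]$.
As $\mathit{Reach}(u) \in \mathit{Safe}(u)$, there exist $(q, \sigma) \in \mathit{Reach}(u)$ and $p \in Q_f$ such that
$(q, \sigma) \xrightarrow{v} (p, \varepsilon)$. Thus, $uv \in L(\mathcal{A})$, and $\mathcal{A}$ is u-universal. □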
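
To make $\mathit{Reach}(u)$ concrete, here is a Python sketch that computes it for a toy VPA. The encoding is an assumption of this example and is not taken from the paper: `push[(q, a)]` is a set of pairs `(p, gamma)` for opening letters, `pop[(q, a, gamma)]` is a set of states for the matching closing letter, a configuration is a pair `(state, stack)` with the stack as a tuple, and initial configurations have an empty stack.

```python
def reach(initial_states, push, pop, prefix):
    """Set of configurations reached after reading `prefix` from an initial state."""
    configs = {(q, ()) for q in initial_states}
    for kind, a in prefix:
        successors = set()
        for q, stack in configs:
            if kind == "open":
                # Opening tag: push one stack symbol.
                for p, gamma in push.get((q, a), ()):
                    successors.add((p, stack + (gamma,)))
            elif stack:
                # Closing tag: pop the top stack symbol.
                for p in pop.get((q, a, stack[-1]), ()):
                    successors.add((p, stack[:-1]))
        configs = successors
    return configs


# Toy non-deterministic VPA: reading `a` from state 0 may push x or y.
push = {(0, "a"): {(0, "x"), (1, "y")}}
pop = {(0, "a", "x"): {0}, (1, "a", "y"): {1}}
after_open = reach({0}, push, pop, [("open", "a")])
after_close = reach({0}, push, pop, [("open", "a"), ("close", "a")])
```

The two runs on the opening tag use different stacks, which is why states alone are not enough in the non-deterministic case.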

4.2 Incremental definition of safe configurations 

In this section, we detail how set $\mathit{Safe}(u)$ of safe configurations can be defined
from set $\mathit{Safe}(u')$ with $u'$ a proper prefix of $u$. In this way, while reading the
linearization $[t_0]$ of a given tree $t_0$, set $\mathit{Safe}(u)$ with $u$ a prefix of $[t_0]$ can be
incrementally defined. In the next section, we will turn this approach into an
algorithm.

4.2.1 Starting point 

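The starting point is to begin with $\mathit{Safe}(a)$, for which we recall the definition:
$\mathit{Safe}(a) = \{\mathcal{C} \mid \forall h \in H_\Sigma, \exists q_f \in Q_f, \exists (q, \sigma) \in \mathcal{C} : (q, \sigma) \xrightarrow{[h]\bar{a}} (q_f, \varepsilon)\}$.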

4.2.2 Reading a letter $\bar{a} \in \bar{\Sigma}$

When reading an $\bar{a} \in \bar{\Sigma}$, we can retrieve safe configurations from prior sets of
safe configurations:

$\mathit{Safe}(u\bar{a}) = \mathit{Safe}(u')$

where $u'$ is the unique prefix of $u$ such that $u = u'a[h]$. Indeed, as shown by
Lemma 33 below, we have $\mathit{Safe}(u'a[h]\bar{a}) = \mathit{Safe}(u')$.

Hence, from an algorithmic point of view, we just have to use a stack to
store these safe configurations. When opening $a$, we put $\mathit{Safe}(u')$ on the stack,
and when closing $a$, we pop it. As $h$ is a hedge, the stack before reading $a$ is
exactly the stack after reading $\bar{a}$.
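
The stack discipline described above can be sketched as follows in Python. The sketch treats the safe sets as opaque values: `safe_after_open` is a hypothetical callback standing for the computation of $\mathit{Safe}(ua)$ from $\mathit{Safe}(u)$, which is the subject of the next section, and `None` is used here as a placeholder for "no safe set yet" before the root is opened.

```python
def track_safe(stream, safe_after_open):
    """Yield the current safe-set value after each letter of `stream`."""
    stack = []
    current = None  # placeholder: nothing is known before the root is opened
    for kind, label in stream:
        if kind == "open":
            stack.append(current)  # remember Safe(u') before opening a
            current = safe_after_open(current, label)
        else:
            current = stack.pop()  # Safe(u' a [h] closing-a) = Safe(u')
        yield current


# Dummy callback: the "safe set" is just the tuple of open ancestors.
def extend_path(previous, label):
    return (previous or ()) + (label,)


stream = [("open", "a"), ("open", "b"), ("close", "b"),
          ("open", "c"), ("close", "c"), ("close", "a")]
trace = list(track_safe(stream, extend_path))
```

With the dummy callback, the value after closing `b` equals the value after opening `a`, which mirrors $\mathit{Safe}(a\,b\,\bar{b}) = \mathit{Safe}(a)$.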

Lemma 33. If $h \in H_\Sigma$, then $\mathit{Safe}(u[h]) = \mathit{Safe}(u)$ and $\mathit{LSafe}(u[h]) = \mathit{LSafe}(u)$.

Proof. ($\supseteq$) Assume $\mathcal{C} \in \mathit{Safe}(u)$, and let $v$ be such that $u[h]v \in [T_\Sigma]$. As $h$ is
a hedge, we have $uv \in [T_\Sigma]$. As $\mathcal{C} \in \mathit{Safe}(u)$, there exists $(q, \sigma) \in \mathcal{C}$ such that
$(q, \sigma) \xrightarrow{v} (p, \varepsilon)$ with $p \in Q_f$. So $\mathcal{C} \in \mathit{Safe}(u[h])$.

($\subseteq$) Conversely, assume $\mathcal{C} \in \mathit{Safe}(u[h])$. Let $v$ be such that $uv \in [T_\Sigma]$. We
also have $u[h]v \in [T_\Sigma]$, so there exists $(q, \sigma) \in \mathcal{C}$ such that $(q, \sigma) \xrightarrow{v} (p, \varepsilon)$ with
$p \in Q_f$. Thus $\mathcal{C} \in \mathit{Safe}(u)$.

The proof is the same for $\mathit{LSafe}(u[h]) = \mathit{LSafe}(u)$, except that we only
consider $v$ of the form $\bar{a}v'$. □

In the rest of Section 4, we only treat sets $\mathit{Safe}(ua)$, since the way of
computing sets $\mathit{Safe}(u\bar{a})$ has just been detailed. The case of sets $\mathit{Safe}(ua)$ is much
more involved.

4.2.3 Reading a letter $a \in \Sigma$

When reading an $a \in \Sigma$, two successive steps are performed, with leaf-safe
configurations as intermediate object:

$\mathit{Safe}(u) \xrightarrow{\text{Step 1}} \mathit{LSafe}(ua) \xrightarrow{\text{Step 2}} \mathit{Safe}(ua)$

We now detail Step 1 and Step 2, i.e. how $\mathit{LSafe}(ua)$ can be defined from
$\mathit{Safe}(u)$, and how $\mathit{Safe}(ua)$ is defined from $\mathit{LSafe}(ua)$. Proposition 34 gives a
first idea of these links. Equivalence (1) states that a set of configurations $\mathcal{C}$ is
leaf-safe for $ua$ iff after performing a $\mathit{Post}_{\bar{a}}(\mathcal{C})$ we get a safe set of configurations
for $u$. Equivalence (2) states that safe configurations for $ua$ are those from
which traversing any hedge leads to a leaf-safe set of configurations, i.e. one
can safely close the $a$-node. Proposition 34 thus relates sets $\mathit{Safe}(u)$, $\mathit{LSafe}(ua)$,
and $\mathit{Safe}(ua)$, however backwardly. Proposition 38 hereafter will relate them
in the right direction.

Proposition 34. Let $ua \in \mathit{PPref}(T_\Sigma)$ with $a \in \Sigma$.

$\mathcal{C} \in \mathit{LSafe}(ua) \iff \mathit{Post}_{\bar{a}}(\mathcal{C}) \in \mathit{Safe}(u)$ (1)
$\mathcal{C} \in \mathit{Safe}(ua) \iff \forall h \in H_\Sigma, \mathit{Post}_{[h]}(\mathcal{C}) \in \mathit{LSafe}(ua)$ (2)

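Proof. ((1) $\Rightarrow$) Let $\mathcal{C} \in \mathit{LSafe}(ua)$ and $\mathcal{C}' = \mathit{Post}_{\bar{a}}(\mathcal{C})$. Let us show that $\mathcal{C}' \in
\mathit{Safe}(u)$. By Lemma 33, it is sufficient to prove that $\mathcal{C}' \in \mathit{Safe}(ua\bar{a})$. Let $v$ be such
that $ua\bar{a}v \in [T_\Sigma]$. As $\mathcal{C} \in \mathit{LSafe}(ua)$ and $\bar{a}v$ starts with $\bar{a} \in \bar{\Sigma}$, there exist
$(q, \sigma) \in \mathcal{C}$ and $(q', \sigma')$ such that $(q, \sigma) \xrightarrow{\bar{a}} (q', \sigma') \xrightarrow{v} (p, \varepsilon)$ for some $p \in Q_f$. By
definition of $\mathit{Post}_{\bar{a}}(\mathcal{C})$ we have $(q', \sigma') \in \mathcal{C}'$ and thus $\mathcal{C}' \in \mathit{Safe}(ua\bar{a})$.

((1) $\Leftarrow$) For the converse, let $\mathcal{C}' = \mathit{Post}_{\bar{a}}(\mathcal{C}) \in \mathit{Safe}(u) = \mathit{Safe}(ua\bar{a})$. Let
us show that $\mathcal{C} \in \mathit{LSafe}(ua)$. Let $v$ be such that $uav \in [T_\Sigma]$ and $v = \bar{b}v'$.
We necessarily have $a = b$. As $\mathcal{C}' \in \mathit{Safe}(ua\bar{a})$, there exists $(q', \sigma') \in \mathcal{C}'$ such
that $(q', \sigma') \xrightarrow{v'} (p, \varepsilon)$ with $p \in Q_f$. By definition of $\mathit{Post}_{\bar{a}}(\mathcal{C})$, there also exists
$(q, \sigma) \in \mathcal{C}$ such that $(q, \sigma) \xrightarrow{\bar{a}} (q', \sigma')$ and thus $(q, \sigma) \xrightarrow{v = \bar{a}v'} (p, \varepsilon)$ with $p \in Q_f$.

((2) $\Rightarrow$) Let $\mathcal{C} \in \mathit{Safe}(ua)$ and $h \in H_\Sigma$. Let us show that $\mathcal{C}' = \mathit{Post}_{[h]}(\mathcal{C})$ is
in $\mathit{LSafe}(ua)$. Let $v$ be such that $uav \in [T_\Sigma]$ and $v = \bar{b}v'$. We must have $a = b$.
We also have $ua[h]\bar{a}v' \in [T_\Sigma]$. As $\mathcal{C} \in \mathit{Safe}(ua)$, there exists $(q, \sigma) \in \mathcal{C}$ such
that $(q, \sigma) \xrightarrow{[h]} (q', \sigma') \xrightarrow{\bar{a}v'} (p, \varepsilon)$ with $p \in Q_f$. By definition of $\mathit{Post}_{[h]}(\mathcal{C})$,
we have $(q', \sigma') \in \mathcal{C}'$, and thus $\mathcal{C}' \in \mathit{LSafe}(ua)$.

((2) $\Leftarrow$) Let us assume that for every hedge $h \in H_\Sigma$, $\mathit{Post}_{[h]}(\mathcal{C}) \in \mathit{LSafe}(ua)$.
Let us show that $\mathcal{C} \in \mathit{Safe}(ua)$. Let $v$ be such that $uav \in [T_\Sigma]$. Then we
have $uav = ua[h]\bar{a}v'$ for some $h \in H_\Sigma$. As $\mathcal{C}' = \mathit{Post}_{[h]}(\mathcal{C}) \in \mathit{LSafe}(ua)$,
$\bar{a}v'$ starts with $\bar{a} \in \bar{\Sigma}$ and $ua\bar{a}v' \in [T_\Sigma]$, there exists $(q', \sigma') \in \mathcal{C}'$ such that
$(q', \sigma') \xrightarrow{\bar{a}v'} (p, \varepsilon)$ for some $p \in Q_f$. Hence, by definition of $\mathit{Post}_{[h]}(\mathcal{C})$, there
also exists $(q, \sigma) \in \mathcal{C}$ such that $(q, \sigma) \xrightarrow{[h]} (q', \sigma') \xrightarrow{\bar{a}v'} (p, \varepsilon)$ with $p \in Q_f$. □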

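We now propose the notion of predecessor, in order to get Step 1 and Step 2
in the right direction.

Definition 35. Let $\mathcal{C}, \mathcal{C}'$ be two sets of configurations, $a \in \Sigma$ and $h \in H_\Sigma$.

• $\mathcal{C}$ is an $\bar{a}$-predecessor of $\mathcal{C}'$ if $\forall (q', \sigma') \in \mathcal{C}', \exists (q, \sigma) \in \mathcal{C}, (q, \sigma) \xrightarrow{\bar{a}} (q', \sigma')$.

• $\mathcal{C}$ is an $h$-predecessor of $\mathcal{C}'$ if $\forall (q', \sigma') \in \mathcal{C}', \exists (q, \sigma) \in \mathcal{C}, (q, \sigma) \xrightarrow{[h]} (q', \sigma')$.

Let $\mathit{Pred}_{\bar{a}}(\mathcal{C}') = \{\mathcal{C} \mid \mathcal{C}$ is an $\bar{a}$-predecessor of $\mathcal{C}'\}$ and $\mathit{Pred}_h(\mathcal{C}') = \{\mathcal{C} \mid \mathcal{C}$
is an $h$-predecessor of $\mathcal{C}'\}$.

From their definitions, the sets of predecessors are $\subseteq$-upward closed.

Lemma 36. $\mathit{Pred}_{\bar{a}}(\mathcal{C}')$ and $\mathit{Pred}_h(\mathcal{C}')$ are $\subseteq$-upward closed sets.

Predecessors closely relate to the $\mathit{Post}$ operator.

Lemma 37. $\mathcal{C}$ is an $\bar{a}$-predecessor of $\mathit{Post}_{\bar{a}}(\mathcal{C})$. If $\mathcal{C}$ is an $\bar{a}$-predecessor of $\mathcal{C}'$
then $\mathcal{C}' \subseteq \mathit{Post}_{\bar{a}}(\mathcal{C})$. Both properties also hold for $\mathit{Post}_{[h]}$.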
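
The two properties of Lemma 37 can be checked on a toy example. The Python sketch below assumes, for illustration only, that pop transitions are stored as `pop[(q, a, gamma)] = set of target states` and that a configuration is a pair `(state, stack)` with the stack as a tuple; the function names are hypothetical.

```python
def step_close(config, a, pop):
    """Configurations reachable from `config` by reading the closing tag of `a`."""
    q, stack = config
    if not stack:
        return set()
    return {(p, stack[:-1]) for p in pop.get((q, a, stack[-1]), ())}


def post_close(configs, a, pop):
    """Post operator for the closing tag of `a` on a set of configurations."""
    return {c2 for c in configs for c2 in step_close(c, a, pop)}


def is_close_predecessor(configs, targets, a, pop):
    """True iff every target is reached from some configuration of `configs`."""
    return all(any(t in step_close(c, a, pop) for c in configs) for t in targets)


pop = {(0, "a", "x"): {0, 1}, (1, "a", "y"): {1}}
C = {(0, ("x",)), (1, ("y",)), (1, ("x",))}
image = post_close(C, "a", pop)
first_property = is_close_predecessor(C, image, "a", pop)
```

Any set of which `C` is a predecessor is included in `image`, which is the second property of the lemma.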

We can now rephrase Proposition 34 in terms of predecessors.

Proposition 38. Let $ua \in \mathit{PPref}(T_\Sigma)$.

$\mathcal{C} \in \mathit{LSafe}(ua) \iff \exists \mathcal{C}' \in \mathit{Safe}(u)$, $\mathcal{C}$ is an $\bar{a}$-predecessor of $\mathcal{C}'$ (3)

$\mathcal{C} \in \mathit{Safe}(ua) \iff \forall h \in H_\Sigma, \exists \mathcal{C}' \in \mathit{LSafe}(ua)$, $\mathcal{C}$ is an $h$-predecessor of $\mathcal{C}'$ (4)

Proof. ((3) $\Rightarrow$) Let $\mathcal{C} \in \mathit{LSafe}(ua)$. Then by Proposition 34, $\mathit{Post}_{\bar{a}}(\mathcal{C}) \in \mathit{Safe}(u)$.
Moreover, $\mathcal{C}$ is an $\bar{a}$-predecessor of $\mathit{Post}_{\bar{a}}(\mathcal{C})$ by Lemma 37.

((3) $\Leftarrow$) Let $\mathcal{C}$ be an $\bar{a}$-predecessor of $\mathcal{C}'$ with $\mathcal{C}' \in \mathit{Safe}(u)$. By Lemma 37,
$\mathcal{C}' \subseteq \mathit{Post}_{\bar{a}}(\mathcal{C})$. By Lemma 31, we also have $\mathit{Post}_{\bar{a}}(\mathcal{C}) \in \mathit{Safe}(u)$, so $\mathcal{C} \in
\mathit{LSafe}(ua)$ by Proposition 34.

((4)) Same proofs, except that $\bar{a}$ has to be replaced by $h$, for all $h \in H_\Sigma$. □

Proposition 38 can be used to perform Step 1 and Step 2 of our method. It
states that safe sets of configurations are only among predecessors of prior safe
sets of configurations. However, the number of hedges to consider in
equivalence (4) is infinite. We use relations to overcome this. Also, the size of $\mathit{Safe}(u)$
may be huge and not all configurations of $\mathit{Safe}(u)$ are crucial for checking
u-universality. We use antichains to have a representation of $\mathit{Safe}(u)$ and to avoid
computations of elements which are not crucial. These two concepts are
explained in the following, in order to get an algorithm for incrementally checking
u-universality.

4.3 An algorithm for u-universality

4.3.1 Antichains 

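Let $\lfloor \mathit{Safe}(u) \rfloor$ denote the set of elements of $\mathit{Safe}(u)$ which are minimal for $\subseteq$,
and similarly for $\lfloor \mathit{LSafe}(u) \rfloor$. These antichains are finite objects.

Proposition 39. $\lfloor \mathit{Safe}(u) \rfloor$ and $\lfloor \mathit{LSafe}(u) \rfloor$ are finite and only contain finite
sets of configurations.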

Proof. We begin with the following observation. Let $v$ be such that $uv \in [T_\Sigma]$
and $(q, \sigma) \xrightarrow{v} (p, \varepsilon)$ with $p \in Q_f$. Let $u' = \mathit{open}(u)$ (recall that $\mathit{open}(u)$ is the
word obtained from $u$ by removing all factors that are linearizations of hedges).
Let $v'$ be the word obtained from $v$ in the same way. Then $|u'| = |v'|$ and
$|\sigma| = |u'|$.

Let $\mathcal{C} \in \mathit{Safe}(u)$. Then by definition

$\forall v, \; uv \in [T_\Sigma] \implies \exists (q, \sigma) \in \mathcal{C}, \; (q, \sigma) \xrightarrow{v} (p, \varepsilon)$ with $p \in Q_f$.

If $\mathcal{C}$ is minimal with respect to $\subseteq$, then every $(q, \sigma) \in \mathcal{C}$ is used for at least one
$v$ in the previous definition. Now by the previous observation, each such $(q, \sigma)$
belongs to $Q \times \Gamma^{|u'|}$. Hence $\mathcal{C} \subseteq Q \times \Gamma^{|u'|}$, and thus both $\mathcal{C}$ and $\lfloor \mathit{Safe}(u) \rfloor$ are
finite.

The same arguments hold for proving that $\lfloor \mathit{LSafe}(u) \rfloor$ is finite and contains
only finite sets of configurations. □

We now try to use these antichains in the starting point, and in Steps 1 
and 2 of our approach. 

4.3.2 Step 1 with antichains: from $\lfloor \mathit{Safe}(u) \rfloor$ to $\lfloor \mathit{LSafe}(ua) \rfloor$

For the two steps, the goal is to adapt Proposition 38 so that it uses $\lfloor \mathit{Safe}(.) \rfloor$
instead of $\mathit{Safe}(.)$, and $\lfloor \mathit{LSafe}(.) \rfloor$ instead of $\mathit{LSafe}(.)$. We begin with Step 1.
Implication ($\Rightarrow$) of equivalence (3) can be directly adapted.

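Proposition 40. Let $ua \in \mathit{PPref}(T_\Sigma)$.

$\mathcal{C} \in \lfloor \mathit{LSafe}(ua) \rfloor \implies \exists \mathcal{C}' \in \lfloor \mathit{Safe}(u) \rfloor$, $\mathcal{C}$ is an $\bar{a}$-predecessor of $\mathcal{C}'$

Proof. Let $\mathcal{C} \in \lfloor \mathit{LSafe}(ua) \rfloor$ and let $\mathcal{C}' = \mathit{Post}_{\bar{a}}(\mathcal{C})$. We know from
Proposition 34 that $\mathcal{C}' \in \mathit{Safe}(u)$. Let $\mathcal{C}'_0 \subseteq \mathcal{C}'$ such that $\mathcal{C}'_0 \in \lfloor \mathit{Safe}(u) \rfloor$. From the
definition of $\mathcal{C}'$ we get:

$\forall c' \in \mathcal{C}', \exists c \in \mathcal{C}, \; c \xrightarrow{\bar{a}} c'$

We build $\mathcal{C}_0$ from these $c$, but only for $c' \in \mathcal{C}'_0$:

$\mathcal{C}_0 = \{c \in \mathcal{C} \mid \exists c' \in \mathcal{C}'_0, \; c \xrightarrow{\bar{a}} c'\}$

Figure 6 illustrates the construction. $\mathcal{C}_0$ is an $\bar{a}$-predecessor of $\mathcal{C}'_0$, so using
Proposition 38, we get $\mathcal{C}_0 \in \mathit{LSafe}(ua)$. Furthermore, $\mathcal{C}_0 \subseteq \mathcal{C} \in \lfloor \mathit{LSafe}(ua) \rfloor$,
so $\mathcal{C}_0 = \mathcal{C}$ and $\mathcal{C}$ is obtained as an $\bar{a}$-predecessor of $\mathcal{C}'_0 \in \lfloor \mathit{Safe}(u) \rfloor$. □

Figure 6: Construction of $\mathcal{C}_0$.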

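Proposition 40 gives us a way to compute $\lfloor \mathit{LSafe}(ua) \rfloor$ from $\lfloor \mathit{Safe}(u) \rfloor$: it
suffices to take all $\bar{a}$-predecessors of elements of $\lfloor \mathit{Safe}(u) \rfloor$ and then limit to
those predecessors that are $\subseteq$-minimal. We can even only consider minimal
$\bar{a}$-predecessors of $\lfloor \mathit{Safe}(u) \rfloor$ in the following sense: $\mathcal{C}$ is a minimal $\bar{a}$-predecessor
of $\mathcal{C}'$ if for all $\mathcal{C}''$ $\bar{a}$-predecessor of $\mathcal{C}'$, $\mathcal{C}'' \subseteq \mathcal{C} \implies \mathcal{C}'' = \mathcal{C}$. We finally
obtain:

Corollary 41.

$\lfloor \mathit{LSafe}(ua) \rfloor = \lfloor \{\mathcal{C} \mid \mathcal{C}$ is a minimal $\bar{a}$-predecessor of $\mathcal{C}' \in \lfloor \mathit{Safe}(u) \rfloor\} \rfloor$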
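
The computation suggested by Corollary 41 can be sketched in Python. A minimal predecessor contains only configurations that are needed, so it is the union of one chosen predecessor per target configuration; the sketch enumerates these choices and keeps the minimal unions. The encoding (`pop[(q, a, gamma)] = set of target states`, configurations as `(state, stack)` pairs) and all function names are assumptions of this example, and the enumeration is exponential in general.

```python
from itertools import product


def close_predecessors_of(config, a, pop):
    """All single configurations that reach `config` by the closing tag of `a`."""
    q2, stack = config
    return {(q, stack + (gamma,))
            for (q, letter, gamma), targets in pop.items()
            if letter == a and q2 in targets}


def keep_minimal(family):
    """Subset-minimal members of a family of frozensets."""
    return {c for c in family if not any(d < c for d in family)}


def minimal_close_predecessors(targets, a, pop):
    """Minimal predecessors of the set `targets` for the closing tag of `a`."""
    choices = [list(close_predecessors_of(t, a, pop)) for t in targets]
    # One predecessor is picked per target; no pick exists if some target has none.
    unions = {frozenset(pick) for pick in product(*choices)}
    return keep_minimal(unions)


def lsafe_antichain(safe_antichain, a, pop):
    """Antichain of minimal predecessors over all members of `safe_antichain`."""
    family = set()
    for targets in safe_antichain:
        family |= minimal_close_predecessors(targets, a, pop)
    return keep_minimal(family)


pop = {(0, "a", "x"): {0, 1}, (1, "a", "y"): {1}}
both = frozenset({(0, ()), (1, ())})
preds = minimal_close_predecessors(both, "a", pop)
```

Here the single configuration `(0, ("x",))` reaches both targets, so it forms the only minimal predecessor of `both`.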

4.3.3 Step 2 with antichains: from $\lfloor \mathit{LSafe}(ua) \rfloor$ to $\lfloor \mathit{Safe}(ua) \rfloor$

The second step for computing $\lfloor \mathit{Safe}(ua) \rfloor$ from $\lfloor \mathit{LSafe}(ua) \rfloor$ relies on the
introduction of antichains in equivalence (4) of Proposition 38. Implication ($\Rightarrow$) holds
with antichains.

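Proposition 42. Let $ua \in \mathit{PPref}(T_\Sigma)$.

$\mathcal{C} \in \lfloor \mathit{Safe}(ua) \rfloor \implies \forall h \in H_\Sigma, \exists \mathcal{C}' \in \lfloor \mathit{LSafe}(ua) \rfloor$, $\mathcal{C}$ is an $h$-predecessor of $\mathcal{C}'$

Proof. The proof is in the same vein as for Proposition 40. Let $\mathcal{C} \in \lfloor \mathit{Safe}(ua) \rfloor$,
and $h \in H_\Sigma$. Let $\mathcal{C}'_h = \mathit{Post}_{[h]}(\mathcal{C})$. By Proposition 34, $\mathcal{C}'_h \in \mathit{LSafe}(ua)$. Let
$\mathcal{C}'_{h,0} \subseteq \mathcal{C}'_h$ such that $\mathcal{C}'_{h,0} \in \lfloor \mathit{LSafe}(ua) \rfloor$. We know that $\forall c' \in \mathcal{C}'_{h,0}, \exists c \in \mathcal{C}$ such
that $c \xrightarrow{[h]} c'$. We define $\mathcal{C}_h = \{c \in \mathcal{C} \mid \exists c' \in \mathcal{C}'_{h,0}, \; c \xrightarrow{[h]} c'\}$. For every $h \in H_\Sigma$,
$\mathcal{C}_h$ is an $h$-predecessor of $\mathcal{C}'_{h,0} \in \mathit{LSafe}(ua)$. Consider $\mathcal{C}_\cup = \bigcup_{h \in H_\Sigma} \mathcal{C}_h$, then $\mathcal{C}_\cup$ is
also an $h$-predecessor of $\mathcal{C}'_{h,0}$. Using Proposition 38, we have $\mathcal{C}_\cup \in \mathit{Safe}(ua)$. As
$\mathcal{C}_\cup \subseteq \mathcal{C}$ and $\mathcal{C} \in \lfloor \mathit{Safe}(ua) \rfloor$, we also have that $\mathcal{C}_\cup = \mathcal{C}$. Hence $\mathcal{C}$ verifies that
$\forall h \in H_\Sigma, \exists \mathcal{C}' \in \lfloor \mathit{LSafe}(ua) \rfloor$ such that $\mathcal{C}$ is an $h$-predecessor of $\mathcal{C}'$. □

Note that this proof does not use the fact that $ua$ ends with a symbol in $\Sigma$,
so Proposition 42 also holds when replacing $ua$ by $u$.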

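Similarly to Proposition 40, we can restrict the $h$-predecessors to consider to only
minimal ones: $\mathcal{C}$ is a minimal $h$-predecessor of $\mathcal{C}'$ if for all $\mathcal{C}''$ $h$-predecessor of
$\mathcal{C}'$, $\mathcal{C}'' \subseteq \mathcal{C} \implies \mathcal{C}'' = \mathcal{C}$. We obtain:

Corollary 43.

$\lfloor \mathit{Safe}(ua) \rfloor = \lfloor \{ \mathcal{C} = \bigcup_{h \in H_\Sigma} \mathcal{C}_h \mid \mathcal{C}_h$ is a minimal $h$-predecessor of $\mathcal{C}'_h \in \lfloor \mathit{LSafe}(ua) \rfloor \} \rfloor$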



This definition does not provide an algorithm, as it still relies on a
quantification over an infinite number of hedges $h \in H_\Sigma$. In fact, only a finite number
of such hedges needs to be considered. The reason is that a hedge does not
change the original stack during the run of a VPA, so a hedge can be considered
as a function mapping each state $q$ to the set of states obtained when traversing
$h$ from $q$. Formally, we have the next definition.

Definition 44. For every $h \in H_\Sigma$, $\mathrm{rel}_h$ is the function from $Q$ to $2^Q$ such that
$q' \in \mathrm{rel}_h(q)$ iff $(q, \sigma) \xrightarrow{[h]} (q', \sigma)$ for some $\sigma \in \Gamma^*$.

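The number of such functions is finite: each function from $Q$ to $2^Q$ is a binary
relation on $Q$, so there are at most $2^{|Q|^2}$ of them. These
functions naturally define an equivalence relation of finite index over $H_\Sigma$:

$h \sim h' \iff \mathrm{rel}_h = \mathrm{rel}_{h'}$.

Let us write $H$ for a subset of $H_\Sigma$ containing one hedge per $\sim$-class. We have
$|H| \le 2^{|Q|^2}$. The next lemma indicates that the computation of $h$-predecessors can
be limited to $h \in H$.

Lemma 45. For every $h \in H_\Sigma$, $\mathcal{C}$ is an $h$-predecessor of $\mathcal{C}'$ iff there exists
$h' \in H$, $h \sim h'$, such that $\mathcal{C}$ is an $h'$-predecessor of $\mathcal{C}'$.

Proof. Let us recall the definition of $h$-predecessor: $\mathcal{C}$ is an $h$-predecessor of $\mathcal{C}'$
if $\forall (q', \sigma) \in \mathcal{C}', \exists (q, \sigma) \in \mathcal{C}, \; (q, \sigma) \xrightarrow{[h]} (q', \sigma)$. Hence if $h \sim h'$ then $\mathcal{C}$ is an
$h$-predecessor of $\mathcal{C}'$ iff $\mathcal{C}$ is an $h'$-predecessor of $\mathcal{C}'$. □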

We propose an algorithm for computing such a set $H$ from a VPA $\mathcal{A}$.
Algorithm 7 is based on the definition of hedges, adapted to relations:

• $\varepsilon$ is the empty hedge, and $\mathrm{rel}_\varepsilon(q) = \{q\}$ for every $q \in Q$. We write this
function $\mathrm{id}_Q$.

• if $h_1, h_2$ are two hedges, then $h_1 h_2$ is a hedge, and $\mathrm{rel}_{h_1 h_2} = \mathrm{rel}_{h_2} \circ \mathrm{rel}_{h_1}$.

• if $h$ is a hedge and $a \in \Sigma$, then $a h \bar{a}$ is a hedge, and $\mathrm{rel}_{a h \bar{a}}(q)$ is the set of
states $q'$ such that there exists $\gamma \in \Gamma$ verifying:

$(q, \varepsilon) \xrightarrow{a} (p, \gamma)$ and $(p', \gamma) \xrightarrow{\bar{a}} (q', \varepsilon)$ with $p' \in \mathrm{rel}_h(p)$.

Algorithm 7 uses the variables ToProcess and Functions with the following
meaning. Functions initially contains the identity relation $\mathrm{id}_Q$; at the end of
the computation, it contains all functions $\mathrm{rel}_h$, for $h \in H_\Sigma$. ToProcess contains
all the newly constructed relations, and these relations are used to create other
new relations as described in the previous definition by induction.



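Algorithm 7 Computing all functions $\mathrm{rel}_h$, for $h \in H_\Sigma$
function HedgeFunctions($A$)
    Functions $\leftarrow \{\mathrm{id}_Q\}$
    ToProcess $\leftarrow \{\mathrm{id}_Q\}$
    while ToProcess $\neq \emptyset$ do
        fct $\leftarrow$ Pop(ToProcess)
        NewFunctions $\leftarrow \emptyset$
        for $f \in$ Functions do
            NewFunctions $\leftarrow$ NewFunctions $\cup\ \{f \circ \mathrm{fct},\ \mathrm{fct} \circ f\}$
        end for
        for $a \in \Sigma$ do
            $f \leftarrow f_\emptyset$    // $f_\emptyset$ maps every $q \in Q$ to $\emptyset$
            for $q \xrightarrow{a:\gamma} p \in \Delta$ and $p' \xrightarrow{\bar{a}:\gamma} q' \in \Delta$ with $p' \in \mathrm{fct}(p)$ do
                $f(q) \leftarrow f(q) \cup \{q'\}$
            end for
            NewFunctions $\leftarrow$ NewFunctions $\cup\ \{f\}$
        end for
        ToProcess $\leftarrow$ ToProcess $\cup$ (NewFunctions $\setminus$ Functions)
        Functions $\leftarrow$ Functions $\cup$ NewFunctions
    end while
    return Functions
end function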
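The saturation loop above can be sketched in Python under some stated assumptions: states are the integers 0..n-1, a function is a tuple of frozensets indexed by state, and call and return transitions are given as 4-tuples. The names `hedge_functions`, `Fn`, `Rule` and the two-state automaton are hypothetical and only serve the illustration.

```python
from typing import FrozenSet, List, Set, Tuple

Fn = Tuple[FrozenSet[int], ...]   # Fn[q] is the set rel_h(q); states are 0..n-1
Rule = Tuple[int, str, str, int]  # (source state, letter, stack symbol, target state)


def compose(second: Fn, first: Fn) -> Fn:
    """second o first on set-valued functions."""
    return tuple(
        frozenset(r for p in first[q] for r in second[p]) for q in range(len(first))
    )


def hedge_functions(
    n_states: int, alphabet: List[str], calls: List[Rule], returns: List[Rule]
) -> Set[Fn]:
    """Saturate {id_Q} under composition and under wrapping a hedge as a h a-bar."""
    identity: Fn = tuple(frozenset({q}) for q in range(n_states))
    functions: Set[Fn] = {identity}
    to_process: Set[Fn] = {identity}
    while to_process:
        fct = to_process.pop()
        new_functions: Set[Fn] = set()
        for f in functions:
            new_functions.add(compose(f, fct))
            new_functions.add(compose(fct, f))
        for a in alphabet:
            wrapped = [set() for _ in range(n_states)]  # starts as f_empty
            for (q, a_open, gamma_push, p) in calls:
                for (p2, a_close, gamma_pop, q2) in returns:
                    same_letter = a_open == a and a_close == a
                    if same_letter and gamma_push == gamma_pop and p2 in fct[p]:
                        wrapped[q].add(q2)
            new_functions.add(tuple(frozenset(s) for s in wrapped))
        to_process |= new_functions - functions
        functions |= new_functions
    return functions


# Hypothetical two-state VPA: opening "a" in state 0 pushes "g" and moves to
# state 1; the matching closing letter in state 1 pops "g" and moves to state 0.
result = hedge_functions(2, ["a"], [(0, "a", "g", 1)], [(1, "a", "g", 0)])
```

On this toy automaton the result contains three functions: the identity, the function of the hedge made of a single leaf, and the function mapping every state to the empty set.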



Proposition 46. Algorithm 7 computes the set
$\{\mathrm{rel}_h \mid h \in H_\Sigma\}$.

Proof. Let Functions be the set computed by Algorithm 7. Clearly,
Functions $\subseteq \{\mathrm{rel}_h \mid h \in H_\Sigma\}$. Assume for
contradiction that there exists $r = \mathrm{rel}_h$ with $h \in H_\Sigma$
such that $r \notin$ Functions. Clearly, $r \neq \mathrm{id}_Q$, and we can
suppose wlog that either $r = r'_2 \circ r'_1$ with
$r'_1, r'_2 \in$ Functions $\setminus \{\mathrm{id}_Q\}$, or there exists
$r' \in$ Functions such that for all $q$, $r(q)$ is the set of $q'$ with
$q \xrightarrow{a:\gamma} p \in \Delta$,
$p' \xrightarrow{\bar{a}:\gamma} q' \in \Delta$ and $p' \in r'(p)$.
Consider the first case. When they have been constructed by Algorithm 7, both
$r'_1$ and $r'_2$ have been added to ToProcess and to Functions. After the
last element (among $r'_1$ and $r'_2$) is popped from ToProcess,
$r = r'_2 \circ r'_1$ is built during the loop on $f \in$ Functions, which
leads to a contradiction. We also have a contradiction in the second case by
considering the loop on $a \in \Sigma$. □

Consequently we can rephrase our definition of
$\lfloor \mathit{Safe}(ua) \rfloor$ from $\lfloor \mathit{LSafe}(ua) \rfloor$
given in Corollary 43 by restricting the quantification on $h$ to the finite
set $H$. Therefore we obtain a finite procedure for computing
$\lfloor \mathit{Safe}(ua) \rfloor$ from $\lfloor \mathit{LSafe}(ua) \rfloor$:



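Proposition 47.

$$\lfloor \mathit{Safe}(ua) \rfloor = \left\lfloor \left\{ \mathcal{C} \;\middle|\; \mathcal{C} = \bigcup_{h \in H} \mathcal{C}_h \text{ with } \mathcal{C}_h \text{ a minimal } h\text{-predecessor of } \mathcal{C}' \in \lfloor \mathit{LSafe}(ua) \rfloor \right\} \right\rfloor$$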



4.3.4 Starting point with antichains 

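It remains to explain how to compute $\mathit{Safe}(a)$. Clearly, by
definition of $H$, we can compute $\lfloor \mathit{Safe}(a) \rfloor$ as
follows:

Proposition 48.

$$\lfloor \mathit{Safe}(a) \rfloor = \left\lfloor \left\{ \mathcal{C} \;\middle|\; \forall h \in H, \exists q_f \in Q_f, \exists (q,\sigma) \in \mathcal{C} : (q,\sigma) \xrightarrow{h \bar{a}} (q_f, \epsilon) \right\} \right\rfloor$$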



4.4 Algorithmic improvements 

The previous section resulted in a first algorithm to incrementally compute
sets of safe configurations. This algorithm can be improved by limiting the
hedges to consider, and by optimizing the operators and predecessors to be
computed. The goal here is to avoid the complexity of the on-the-fly
determinization procedure.



4.4.1 Minimal hedges 

A first improvement is obtained by further restricting the hedges to
consider. Indeed it suffices to consider minimal hedges wrt their function
$\mathrm{rel}_h$. Formally, let us write $h \le h'$ whenever
$\mathrm{rel}_h(q) \subseteq \mathrm{rel}_{h'}(q)$ for every $q \in Q$. We
denote by $\lfloor H \rfloor$ the $\le$-minimal elements of $H$. Notice that
Algorithm 7, which computes the set $\{\mathrm{rel}_h \mid h \in H\}$, can
easily be adapted to compute the set of its minimal elements, such that
NewFunctions and ToProcess are restricted to antichains of minimal elements.
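The ordering on hedges and the extraction of minimal elements can be sketched as follows, assuming functions are stored as tuples of frozensets indexed by state. This is a plain quadratic filter over an already computed set, not the antichain-based adaptation of Algorithm 7 mentioned above; the names `leq` and `minimal_functions` and the sample functions are illustrative assumptions.

```python
from typing import FrozenSet, Iterable, List, Tuple

Fn = Tuple[FrozenSet[int], ...]  # Fn[q] is the set rel_h(q); states are 0..n-1


def leq(f: Fn, g: Fn) -> bool:
    """f <= g iff f(q) is included in g(q) for every state q."""
    return all(fq <= gq for fq, gq in zip(f, g))


def minimal_functions(functions: Iterable[Fn]) -> List[Fn]:
    """Keep only the <=-minimal functions, which form an antichain."""
    pool = list(set(functions))
    return [f for f in pool if not any(g != f and leq(g, f) for g in pool)]


# Hypothetical functions over Q = {0, 1}.
f = (frozenset({0}), frozenset({1}))
g = (frozenset({1}), frozenset({0}))
h = (frozenset({0, 1}), frozenset({1}))

# h is dropped because f <= h; f and g are incomparable and both survive.
antichain = minimal_functions([f, g, h])
```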

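From the definition of $h$-predecessor, for every
$\mathcal{C}, \mathcal{C}' \subseteq Q \times \Gamma^*$ we have:

$$\mathcal{C} \text{ is an } h\text{-predecessor of } \mathcal{C}' \text{ and } h \le h' \implies \mathcal{C} \text{ is an } h'\text{-predecessor of } \mathcal{C}' \qquad (5)$$

This property can be used to replace $h \in H$ in Proposition 47 by
$h \in \lfloor H \rfloor$.

Proposition 49.

$$\lfloor \mathit{Safe}(ua) \rfloor = \left\lfloor \left\{ \mathcal{C} \;\middle|\; \mathcal{C} = \bigcup_{h \in \lfloor H \rfloor} \mathcal{C}_h \text{ with } \mathcal{C}_h \text{ a minimal } h\text{-predecessor of } \mathcal{C}' \in \lfloor \mathit{LSafe}(ua) \rfloor \right\} \right\rfloor$$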

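Proof. Let $S$ denote the set

$$\left\{ \mathcal{C} \;\middle|\; \mathcal{C} = \bigcup_{h \in H} \mathcal{C}_h \text{ with } \mathcal{C}_h \text{ a minimal } h\text{-predecessor of } \mathcal{C}' \in \lfloor \mathit{LSafe}(ua) \rfloor \right\}$$

Let $\mathcal{C} \in S$. We have:

$$\mathcal{C} = \underbrace{\mathcal{C}_{h_1} \cup \cdots \cup \mathcal{C}_{h_k}}_{h_i \in \lfloor H \rfloor} \cup \underbrace{\mathcal{C}_{h'_1} \cup \cdots \cup \mathcal{C}_{h'_n}}_{h'_j \in H \setminus \lfloor H \rfloor}.$$

Let us show that
$\mathcal{C}_{h_1} \cup \cdots \cup \mathcal{C}_{h_k} \cup \mathcal{C}_{h'_1} \cup \cdots \cup \mathcal{C}_{h'_{n-1}} \in S$.
By induction, this will prove that
$\mathcal{C}_{h_1} \cup \cdots \cup \mathcal{C}_{h_k} \in S$. We have
$h'_n \in H \setminus \lfloor H \rfloor$, so there exists
$h_i \in \lfloor H \rfloor$ such that $h_i \le h'_n$. As $\mathcal{C}_{h_i}$
is a minimal $h_i$-predecessor of an element $\mathcal{C}'$ in
$\lfloor \mathit{LSafe}(ua) \rfloor$, it follows from (5) that
$\mathcal{C}_{h_i}$ is also a minimal $h'_n$-predecessor of $\mathcal{C}'$.
So
$\mathcal{C}_{h_1} \cup \cdots \cup \mathcal{C}_{h_k} \cup \mathcal{C}_{h'_1} \cup \cdots \cup \mathcal{C}_{h'_{n-1}} \in S$. □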

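We also have the next proposition.

Proposition 50.

$$\lfloor \mathit{Safe}(a) \rfloor = \left\lfloor \left\{ \mathcal{C} \;\middle|\; \forall h \in \lfloor H \rfloor, \exists q_f \in Q_f, \exists (q,\sigma) \in \mathcal{C} : (q,\sigma) \xrightarrow{h \bar{a}} (q_f, \epsilon) \right\} \right\rfloor$$

4.4.2 An appropriate union operator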

Proposition 49 expresses that every set of configurations in
$\lfloor \mathit{Safe}(ua) \rfloor$ is the union of sets $\mathcal{C}_h$ with
$h \in \lfloor H \rfloor$. We introduce a new operator to improve the
readability and find new properties.

Definition 51. Let $S$ be a finite set, and
$A, B \in 2^{2^S \setminus \{\emptyset\}}$. The set
$A \sqcup B \in 2^{2^S}$ is defined by:

$$A \sqcup B = \{a \cup b \mid a \in A \text{ and } b \in B\}$$

Operator $\sqcup$ builds sets obtained by taking one set of each of its
operands, and performing their union. It is obviously associative and
commutative. Notice that the elements of $A, B$ are supposed to be non-empty
sets. This will always be the case in the following algorithms using this
operator. Proposition 49 can now be rewritten as follows.
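A minimal sketch of the operator $\sqcup$ and of the minimal-elements operator $\lfloor \cdot \rfloor$, assuming families of sets are encoded as Python sets of frozensets. The names `sqcup` and `floor` and the sample families are illustrative assumptions.

```python
from itertools import product
from typing import FrozenSet, Set

Family = Set[FrozenSet[int]]  # assumed encoding of an element of 2^(2^S)


def sqcup(a: Family, b: Family) -> Family:
    """A sqcup B: all unions of one element of A with one element of B."""
    return {x | y for x, y in product(a, b)}


def floor(family: Family) -> Family:
    """Elements of the family that are minimal for set inclusion."""
    return {x for x in family if not any(y < x for y in family)}


# Hypothetical families over S = {1, 2, 3}.
A: Family = {frozenset({1}), frozenset({2, 3})}
B: Family = {frozenset({1, 2}), frozenset({3})}

combined = sqcup(A, B)     # contains {1,2}, {1,3}, {1,2,3} and {2,3}
minimal = floor(combined)  # {1,2,3} is dropped since it strictly contains {1,2}
```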

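Proposition 52.

$$\lfloor \mathit{Safe}(ua) \rfloor = \left\lfloor \bigsqcup_{h \in \lfloor H \rfloor} \left\{ \mathcal{C}_h \;\middle|\; \mathcal{C}_h \text{ is a minimal } h\text{-predecessor of } \mathcal{C}' \in \lfloor \mathit{LSafe}(ua) \rfloor \right\} \right\rfloor$$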

When combined with operator $\lfloor \cdot \rfloor$, the operands of the
$\sqcup$ operator can be split, so that $\sqcup$ is computed on smaller sets.

Lemma 53. $\lfloor A \sqcup B \rfloor = \lfloor (A \cap B) \cup (A \setminus B \sqcup B \setminus A) \rfloor$



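Proof. ($\supseteq$) Let
$\mathcal{C} \in \lfloor (A \cap B) \cup (A \setminus B \sqcup B \setminus A) \rfloor$.
Then $\mathcal{C} \in A \sqcup B$. For contradiction, let us assume that there
exists $\mathcal{C}' \subsetneq \mathcal{C}$ such that
$\mathcal{C}' \in \lfloor A \sqcup B \rfloor$. If $\mathcal{C}' \in A \cap B$
then $\mathcal{C}' \in (A \cap B) \cup (A \setminus B \sqcup B \setminus A)$,
which contradicts the minimality of $\mathcal{C}$. So
$\mathcal{C}' \notin A \cap B$, and assume wlog that
$\mathcal{C}' = a \cup b$ with $a \in A \setminus B$ and $b \in B$. If
$b \in A$ then $b \in A \cap B \subseteq A \sqcup B$ and
$b \subsetneq \mathcal{C}'$, but this contradicts the minimality of
$\mathcal{C}'$. If $b \notin A$ then
$\mathcal{C}' \in A \setminus B \sqcup B \setminus A$, so
$\mathcal{C}' \in (A \cap B) \cup (A \setminus B \sqcup B \setminus A)$, and
$\mathcal{C}' \subsetneq \mathcal{C}$, which contradicts the minimality of
$\mathcal{C}$.

($\subseteq$) Let $\mathcal{C} \in \lfloor A \sqcup B \rfloor$. Let us first
show that
$\mathcal{C} \in (A \cap B) \cup (A \setminus B \sqcup B \setminus A)$.
If $\mathcal{C} \in A \cap B$ this is direct. Otherwise
$\mathcal{C} = a \cup b$ with $a \in A \setminus B$ and $b \in B$ (the other
case is symmetric). If $b \in A$ then
$b \in A \cap B \subseteq A \sqcup B$ and $b \subsetneq \mathcal{C}$, which
contradicts the definition of $\mathcal{C}$. So $b \in B \setminus A$, and
$\mathcal{C} \in A \setminus B \sqcup B \setminus A$. Now, assume for
contradiction that there exists $\mathcal{C}' \subsetneq \mathcal{C}$ such
that
$\mathcal{C}' \in \lfloor (A \cap B) \cup (A \setminus B \sqcup B \setminus A) \rfloor$.
Then, according to ($\supseteq$),
$\mathcal{C}' \in \lfloor A \sqcup B \rfloor$, which contradicts the
definition of $\mathcal{C}$. □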

Corollary 54. If $A \subseteq B$, then $\lfloor A \sqcup B \rfloor = \lfloor A \rfloor$.
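As a sanity check, the identities of Lemma 53 and Corollary 54 can be tested exhaustively on a small universe. The sketch below assumes families are Python sets of frozensets and restricts attention to the non-empty subsets of $\{0,1,2\}$ of size at most two; the helper names are illustrative, and such a finite check is of course no substitute for the proof.

```python
from itertools import combinations, product


def sqcup(a, b):
    """A sqcup B: all unions of one element of A with one element of B."""
    return {x | y for x, y in product(a, b)}


def floor(family):
    """Elements of the family that are minimal for set inclusion."""
    return {x for x in family if not any(y < x for y in family)}


def lemma_53_holds(a, b):
    """Check floor(A sqcup B) == floor((A & B) | ((A - B) sqcup (B - A)))."""
    return floor(sqcup(a, b)) == floor((a & b) | sqcup(a - b, b - a))


def corollary_54_holds(a, b):
    """If A is included in B, check floor(A sqcup B) == floor(A)."""
    return not a <= b or floor(sqcup(a, b)) == floor(a)


# All families built from the six subsets of {0, 1, 2} of size one or two.
subsets = [frozenset(s) for r in (1, 2) for s in combinations(range(3), r)]
families = [
    set(chosen)
    for r in range(len(subsets) + 1)
    for chosen in combinations(subsets, r)
]

lemma_ok = all(lemma_53_holds(a, b) for a, b in product(families, repeat=2))
corollary_ok = all(corollary_54_holds(a, b) for a, b in product(families, repeat=2))
```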

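The $\sqcup$ operator also simplifies the definition of
$\lfloor \mathit{Safe}(a) \rfloor$. From this new definition, an algorithm
follows.

Proposition 55. $\lfloor \mathit{Safe}(a) \rfloor = \left\lfloor \bigsqcup_{h \in \lfloor H \rfloor} A_h \right\rfloor$ with

$$A_h = \left\{ \{(q,\sigma)\} \;\middle|\; q \in Q, \sigma \in \Gamma : \exists q_f \in Q_f : (q,\sigma) \xrightarrow{h \bar{a}} (q_f, \epsilon) \right\}.$$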

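Proof. 1. Every element of $\bigsqcup_{h \in \lfloor H \rfloor} A_h$ belongs
to $\mathit{Safe}(a)$. Thus
$\bigsqcup_{h \in \lfloor H \rfloor} A_h \subseteq \mathit{Safe}(a)$.

2. Let us show that for each $\mathcal{C}$ in $\mathit{Safe}(a)$, there
exists $\mathcal{C}' \in \bigsqcup_{h \in \lfloor H \rfloor} A_h$ such that
$\mathcal{C}' \subseteq \mathcal{C}$. Let $\mathcal{C} \in \mathit{Safe}(a)$.
By definition, for all $h \in \lfloor H \rfloor$ there exist
$(q_h,\sigma_h) \in \mathcal{C}$ and $q_f \in Q_f$ such that
$(q_h,\sigma_h) \xrightarrow{h \bar{a}} (q_f, \epsilon)$. Let
$\mathcal{C}' = \{(q_h,\sigma_h) \mid h \in \lfloor H \rfloor\}$. Then
$\mathcal{C}' \subseteq \mathcal{C}$ and
$\mathcal{C}' \in \bigsqcup_{h \in \lfloor H \rfloor} A_h$ because
$\{(q_h,\sigma_h)\} \in A_h$, $\forall h$.

3. Assume that there exists
$\mathcal{C} \in \left\lfloor \bigsqcup_{h \in \lfloor H \rfloor} A_h \right\rfloor \setminus \lfloor \mathit{Safe}(a) \rfloor$.
By 1., there exists $\mathcal{C}'$ in $\lfloor \mathit{Safe}(a) \rfloor$ such
that $\mathcal{C}' \subsetneq \mathcal{C}$, and by 2., there exists
$\mathcal{C}'' \in \bigsqcup_{h \in \lfloor H \rfloor} A_h$ such that
$\mathcal{C}'' \subseteq \mathcal{C}' \subsetneq \mathcal{C}$, in
contradiction with the definition of $\mathcal{C}$. Therefore
$\left\lfloor \bigsqcup_{h \in \lfloor H \rfloor} A_h \right\rfloor \subseteq \lfloor \mathit{Safe}(a) \rfloor$.

4. Let $\mathcal{C} \in \lfloor \mathit{Safe}(a) \rfloor$. By 2., there
exists $\mathcal{C}' \in \bigsqcup_{h \in \lfloor H \rfloor} A_h$ such that
$\mathcal{C}' \subseteq \mathcal{C}$. By 3., it follows that
$\mathcal{C} = \mathcal{C}'$ and thus
$\lfloor \mathit{Safe}(a) \rfloor \subseteq \left\lfloor \bigsqcup_{h \in \lfloor H \rfloor} A_h \right\rfloor$.
□

4.4.3 Using SAT solvers to find minimal predecessors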



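The computation of minimal predecessors is the key operation for Step 1 and
Step 2, which respectively compute $\lfloor \mathit{LSafe}(ua) \rfloor$ from
$\lfloor \mathit{Safe}(u) \rfloor$ and $\lfloor \mathit{Safe}(ua) \rfloor$
from $\lfloor \mathit{LSafe}(ua) \rfloor$ using the following formulas (see
Corollary 41 and Proposition 52):

$$\lfloor \mathit{LSafe}(ua) \rfloor = \left\lfloor \left\{ \mathcal{C} \;\middle|\; \mathcal{C} \text{ is a minimal } a\text{-predecessor of } \mathcal{C}' \in \lfloor \mathit{Safe}(u) \rfloor \right\} \right\rfloor$$

$$\lfloor \mathit{Safe}(ua) \rfloor = \left\lfloor \bigsqcup_{h \in \lfloor H \rfloor} \left\{ \mathcal{C}_h \;\middle|\; \mathcal{C}_h \text{ is a minimal } h\text{-predecessor of } \mathcal{C}' \in \lfloor \mathit{LSafe}(ua) \rfloor \right\} \right\rfloor$$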

We propose a method to compute minimal predecessors by performing multiple
calls to a SAT solver. A SAT solver is an algorithm used to efficiently test
the satisfiability of a boolean formula $\varphi$, that is, to check whether
there exists a valuation $v$ of the boolean variables of $\varphi$ that makes
$\varphi$ true. In this case we say that $v$ is a model of $\varphi$, denoted
by $v \models \varphi$.

Most of the SAT solvers require that the boolean formula given as input is a 
conjunction of clauses (where a clause is a disjunction of literals, and a literal is 
a variable or its negation). Such formulas are said to be in conjunctive normal 
form (CNF). In the following all input formulas will be in CNF. 

We first detail a method to compute all minimal $a$-predecessors of
$\mathcal{C}'$. It is also valid to compute all minimal $h$-predecessors of
$\mathcal{C}'$.



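Minimal predecessors. We recall that $\mathcal{C}$ is an $a$-predecessor of
$\mathcal{C}'$ if for all $(q',\sigma') \in \mathcal{C}'$, there exists
$(q,\sigma) \in \mathcal{C}$ such that
$(q,\sigma) \xrightarrow{a} (q',\sigma')$. Let us write
$\varphi_a(\mathcal{C}')$ for the following boolean formula:

$$\varphi_a(\mathcal{C}') = \bigwedge_{c' \in \mathcal{C}'} \ \bigvee_{c \xrightarrow{a} c'} x_c,$$

and let $v_{\mathcal{C}}$ be the valuation such that
$v_{\mathcal{C}}(x_c) = 1$ iff $c \in \mathcal{C}$. Then we immediately
obtain that:

$$v_{\mathcal{C}} \models \varphi_a(\mathcal{C}') \text{ iff } \mathcal{C} \text{ is an } a\text{-predecessor of } \mathcal{C}'$$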
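The construction of $\varphi_a(\mathcal{C}')$ can be sketched as follows, assuming configurations are encoded as (state, stack) pairs and the one-step relation on the letter $a$ is supplied as a predicate. Each clause is stored as the set of configurations $c$ whose variable $x_c$ occurs in it. The names `predecessor_cnf`, `is_model` and the sample step relation are hypothetical.

```python
from typing import Callable, FrozenSet, Iterable, List, Set, Tuple

Config = Tuple[str, str]  # (state, stack content), an assumed encoding


def predecessor_cnf(
    targets: Iterable[Config],
    candidates: Iterable[Config],
    step: Callable[[Config, Config], bool],
) -> List[FrozenSet[Config]]:
    """One positive clause per target c': the candidates c with c -a-> c'."""
    pool = list(candidates)
    return [frozenset(c for c in pool if step(c, c_prime)) for c_prime in targets]


def is_model(chosen: Set[Config], clauses: List[FrozenSet[Config]]) -> bool:
    """The valuation v_C of the set `chosen` satisfies every clause."""
    return all(clause & chosen for clause in clauses)


# Hypothetical configurations and a-steps, for illustration only.
c1, c2, c3 = ("p", "g"), ("q", "g"), ("r", "g")
t1, t2 = ("s", "g"), ("t", "g")
A_STEPS = {(c1, t1), (c2, t1), (c2, t2), (c3, t2)}

clauses = predecessor_cnf([t1, t2], [c1, c2, c3], lambda c, cp: (c, cp) in A_STEPS)
```

With this toy relation, `{c2}` and `{c1, c3}` are both $a$-predecessors of `{t1, t2}`, while `{c1}` alone is not.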

We define an ordering over valuations as follows, in a way to have a notion
of minimal models equivalent to minimal predecessors. Let $\varphi$ be a CNF
boolean formula over the set $V$ of boolean variables, and let $v$ and $v'$ be
two valuations over $V$. We define $v' \le v$ iff for all variables
$x \in V$, $v'(x) = 1 \implies v(x) = 1$. We denote $v' < v$ if $v' \le v$
and $v' \neq v$. We say that a model $v$ of $\varphi$ is minimal if for all
models $v'$ of $\varphi$, we have $v' \le v \implies v' = v$. We get the next
characterization, which also holds for $h$-predecessors.

Lemma 56. $\mathcal{C}$ is a minimal $a$-predecessor of $\mathcal{C}'$ iff
$v_{\mathcal{C}}$ is a minimal model of $\varphi_a(\mathcal{C}')$.

We can now explain how to compute all the minimal $a$-predecessors of
$\mathcal{C}'$, or equivalently all the minimal models of formula
$\varphi_a(\mathcal{C}')$.

Let $\varphi$ be a CNF boolean formula over $V$. First, we explain, knowing a
model $v$ of $\varphi$, how to compute a model $v'$ of $\varphi$ such that
$v' < v$ (if it exists). Consider the next formula $\varphi'$:

$$\varphi' = \varphi \wedge \Big( \bigwedge_{x \in V_0} \neg x \Big) \wedge \Big( \bigvee_{x \in V_1} \neg x \Big)$$

where $V_0$ (respectively $V_1$) is the set of all variables $x \in V$ such
that $v(x) = 0$ (resp. $v(x) = 1$). If $\varphi'$ has a model $v'$, it follows
from the definition of $\varphi'$ that $v'$ is a model of $\varphi$ such that
$v' < v$. Otherwise, $v$ is a minimal model of $\varphi$. So from a model of
$\varphi$ we can compute a minimal model of $\varphi$ by repeating the above
procedure.
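The shrinking procedure can be sketched as follows. The sketch assumes DIMACS-style clauses (lists of signed integers) and uses a brute-force `solve` function as a stand-in for a real SAT solver call, so that the example stays self-contained; the names `solve` and `minimise` are illustrative assumptions.

```python
from itertools import product
from typing import Dict, List, Optional

Clause = List[int]      # DIMACS-style literals: +i is x_i, -i is "not x_i"
Model = Dict[int, bool]


def solve(clauses: List[Clause], variables: List[int]) -> Optional[Model]:
    """Brute-force stand-in for a SAT solver call (exponential, illustration only)."""
    for bits in product([False, True], repeat=len(variables)):
        model = dict(zip(variables, bits))
        if all(any(model[abs(lit)] == (lit > 0) for lit in c) for c in clauses):
            return model
    return None


def minimise(clauses: List[Clause], variables: List[int], model: Model) -> Model:
    """Shrink a model of phi until no strictly smaller model exists."""
    while True:
        v0 = [x for x in variables if not model[x]]
        v1 = [x for x in variables if model[x]]
        # phi' = phi AND (not x for every x in V0) AND (OR of not x, x in V1)
        phi_prime = clauses + [[-x] for x in v0] + [[-x for x in v1]]
        smaller = solve(phi_prime, variables)
        if smaller is None:
            return model  # no model strictly below: the current one is minimal
        model = smaller


# phi = (x1 or x2) and (x2 or x3), starting from the all-true model.
phi = [[1, 2], [2, 3]]
variables = [1, 2, 3]
from_all_true = minimise(phi, variables, {1: True, 2: True, 3: True})
from_x1_x3 = minimise(phi, variables, {1: True, 2: False, 3: True})
```

When $V_1$ is empty the last added clause is the empty clause, so $\varphi'$ is unsatisfiable and the all-false model is returned as minimal, as expected.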

Second, let us explain how to compute all the minimal models of $\varphi$.
Suppose that we already know some minimal model $v$ of $\varphi$, and let
$V_1$ be the set of variables $x \in V$ such that $v(x) = 1$. Consider the
formula

$$\varphi' = \varphi \wedge \Big( \bigvee_{x \in V_1} \neg x \Big).$$

Then a model $v'$ of $\varphi'$, if it exists, is a model of $\varphi$ such
that neither $v' < v$ (since $v$ is minimal) nor $v \le v'$ (by definition of
$\varphi'$). With the previous procedure, we thus get a minimal model of
$\varphi$ that is distinct from $v$. In this way we can compute all minimal
models of $\varphi$.
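The enumeration of all minimal models can be sketched by accumulating one blocking clause per minimal model already found. As before, clauses are DIMACS-style lists of signed integers and a brute-force `solve` stands in for a real SAT solver; all function names are illustrative assumptions.

```python
from itertools import product
from typing import Dict, List, Optional

Clause = List[int]      # DIMACS-style literals: +i is x_i, -i is "not x_i"
Model = Dict[int, bool]


def solve(clauses: List[Clause], variables: List[int]) -> Optional[Model]:
    """Brute-force stand-in for a SAT solver call (exponential, illustration only)."""
    for bits in product([False, True], repeat=len(variables)):
        model = dict(zip(variables, bits))
        if all(any(model[abs(lit)] == (lit > 0) for lit in c) for c in clauses):
            return model
    return None


def minimise(clauses: List[Clause], variables: List[int], model: Model) -> Model:
    """Shrink a model of phi until no strictly smaller model exists."""
    while True:
        v0 = [x for x in variables if not model[x]]
        v1 = [x for x in variables if model[x]]
        smaller = solve(clauses + [[-x] for x in v0] + [[-x for x in v1]], variables)
        if smaller is None:
            return model
        model = smaller


def all_minimal_models(clauses: List[Clause], variables: List[int]) -> List[Model]:
    """Enumerate the minimal models of phi, blocking each one once it is found."""
    found: List[Model] = []
    blocking: List[Clause] = []
    while True:
        # A model of phi that is not above any minimal model found so far.
        model = solve(clauses + blocking, variables)
        if model is None:
            return found
        minimal = minimise(clauses, variables, model)
        found.append(minimal)
        # Blocking clause: at least one variable true in `minimal` must be false.
        blocking.append([-x for x in variables if minimal[x]])


# phi = (x1 or x2) and (x2 or x3) has two minimal models: {x2} and {x1, x3}.
models = all_minimal_models([[1, 2], [2, 3]], [1, 2, 3])
true_sets = {frozenset(x for x in m if m[x]) for m in models}
```

Because the blocking clauses contain only negative literals, any valuation below a model satisfying them also satisfies them, which is why minimising with respect to $\varphi$ alone cannot return a previously found model.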

This approach has been detailed for minimal $a$-predecessors. It also works
for minimal $h$-predecessors.

Step 1 with SAT solvers. The computation of the set
$\lfloor \mathit{LSafe}(ua) \rfloor$ from $\lfloor \mathit{Safe}(u) \rfloor$
can also be done using SAT solvers. Indeed, suppose that given
$\mathcal{C}'_1 \in \lfloor \mathit{Safe}(u) \rfloor$, we have computed all
the minimal $a$-predecessors of $\mathcal{C}'_1$ as explained before. Let
$\mathcal{C}'_2$ be another element of $\lfloor \mathit{Safe}(u) \rfloor$. As
done previously, we can express by boolean formulas that we want to compute
the minimal $a$-predecessors of $\mathcal{C}'_2$ that are either strictly
included in some minimal $a$-predecessor of $\mathcal{C}'_1$ or incomparable
with all minimal $a$-predecessors of $\mathcal{C}'_1$.

Step 2 with SAT solvers. The computation of the set
$\lfloor \mathit{Safe}(ua) \rfloor$ from $\lfloor \mathit{LSafe}(ua) \rfloor$
can be done as in Proposition 52 by using operator $\sqcup$ and exploiting
its properties.

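Under the hypothesis that $\epsilon \in \lfloor H \rfloor$, an alternative is
possible with Proposition 49, stating that $\lfloor \mathit{Safe}(ua) \rfloor$
is equal to

$$\left\lfloor \left\{ \mathcal{C} \;\middle|\; \mathcal{C} = \bigcup_{h \in \lfloor H \rfloor} \mathcal{C}_h \text{ with } \mathcal{C}_h \text{ a minimal } h\text{-predecessor of } \mathcal{C}' \in \lfloor \mathit{LSafe}(ua) \rfloor \right\} \right\rfloor$$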

It is based on the following observations. Fix some $\mathcal{C}$ and
$\mathcal{C}'$ in the previous equality. First, if $h = \epsilon$, then
$\mathcal{C}'$ is the only minimal $h$-predecessor of $\mathcal{C}'$ and thus
$\mathcal{C}' \subseteq \mathcal{C}$. Second, we know by the proof of
Proposition 39 that $\mathcal{C} \subseteq Q \times \Gamma^{|u'|}$ where
$u' = \mathit{open}(u)$. Therefore, instead of computing $\mathcal{C}$ as a
union $\bigcup_{h \in \lfloor H \rfloor} \mathcal{C}_h$, we can compute it
starting from $\mathcal{C}'$ and adding elements of
$Q \times \Gamma^{|u'|}$ one by one, until we get an element $\mathcal{C}$ of
$\mathit{Safe}(ua)$. By the way it is constructed,
$\mathcal{C} \in \lfloor \mathit{Safe}(ua) \rfloor$. We can check that such an
element belongs to $\mathit{Safe}(ua)$ with Proposition 34 by testing, for all
$h \in \lfloor H \rfloor$, whether there exists
$\mathcal{C}'' \in \lfloor \mathit{LSafe}(ua) \rfloor$ such that
$\mathit{Post}_h(\mathcal{C}) \supseteq \mathcal{C}''$. To get the whole set
$\lfloor \mathit{Safe}(ua) \rfloor$, we need to consider all the possibilities
to enlarge $\mathcal{C}'$ with elements of $Q \times \Gamma^{|u'|}$. This task
can be done efficiently with the help of SAT solvers (with ideas similar to
the ones developed above).